tensorfuse/stts

STTS is an open-source audio inference server built specifically for voice models. It lets you self-host the latest open-source STT/TTS models, with a streaming API, on your own infrastructure.

✨ Supported Models

Below is a list of supported models along with the time to first byte (TTFB) for each. If you want us to add more models, email us at founders@tensorfuse.io.

Model Name             Type  TTFB    GPU Used
Orpheus (3B)           TTS   180 ms  1x H100
Whisper (coming soon)  STT   ----    ----

⚡ Quickstart

  • Run with Docker (recommended) on Linux: SSH into a remote GPU server and run the Docker command below in the terminal. It starts the streaming server on port 7004.
  • Note: Orpheus is a gated model on Hugging Face; make sure you have been granted access to it before passing your token.
docker run --gpus=all \
  -p 8000:8000 -p 8001:8001 -p 8002:8002 -p 8003:8003 -p 7004:7004 \
  --shm-size=1g \
  -e HUGGING_FACE_HUB_TOKEN=hf_XXXX \
  -v "${HOME}/.cache/huggingface":/cache/huggingface \
  tensorfuse/stts:latest --mode tts --model orpheus
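
Model weights are pulled from Hugging Face on first start (hence the cache mount), so the server may take a while to become ready. Below is a minimal readiness probe as a sketch: it only checks that port 7004 accepts connections, not that the model has finished loading.

import time
import requests

# Poll until the server accepts connections on port 7004.
# This only confirms the port is open; the model may still be loading.
for _ in range(60):
    try:
        requests.get("http://localhost:7004", timeout=2)
        break
    except requests.exceptions.ConnectionError:
        time.sleep(5)
else:
    raise RuntimeError("server did not come up on port 7004 within 5 minutes")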

You can then start streaming audio from the server using the Python script below:

import requests
import sseclient  # Make sure this is from sseclient-py (pip install sseclient-py)
import logging
import base64
import wave
import json
import os
import sys

logging.basicConfig(level=logging.INFO, stream=sys.stdout)
logger = logging.getLogger("sse-debug")
API_URL = "http://localhost:7004/v1/audio"
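# <giggle> below is an inline emotive tag; Orpheus accepts such tags in the input text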
TEXT = "<giggle> The quick brown fox jumps over the lazy dog"
MODEL = "orpheus"
OUT = "debug_audio.wav"

def debug_sse():
    params = {"text": TEXT, "model": MODEL}
    logger.info(f"Requesting: {API_URL} with {params}")
    try:
        response = requests.post(API_URL, params=params, stream=True)
        logger.info(f"HTTP {response.status_code}")
        logger.info(f"Headers: {dict(response.headers)}")
        logger.info(f"Content-Type: {response.headers.get('content-type','')}")
        if 'text/event-stream' not in response.headers.get('content-type', ''):
            logger.warning("Server did NOT send the expected text/event-stream Content-Type!")

        client = sseclient.SSEClient(response)  # Do not read from response first!
        print("Connected to SSE, waiting for events:")
        all_bytes = bytearray()
        sample_rate = None

        for event in client.events():
            logger.info(f"Received SSE event: {event.data[:70]}{'...' if len(event.data) > 70 else ''}")
            if event.data == '[DONE]':
                logger.info("Stream finished, got [DONE].")
                break
            try:
                data = json.loads(event.data)
                if 'audio' in data:
                    chunk = base64.b64decode(data['audio'])
                    all_bytes.extend(chunk)
                    if not sample_rate:
                        sample_rate = data.get('rate')
            except Exception as e:
                logger.error(f"Failed to parse SSE data: {event.data} ({e})")

        if all_bytes and sample_rate:
            with wave.open(OUT, 'wb') as f:
                f.setnchannels(1)
                f.setsampwidth(2)
                f.setframerate(sample_rate)
                f.writeframes(all_bytes)
            logger.info(f"Saved audio to {os.path.abspath(OUT)}")
        else:
            logger.warning("No audio chunks or sample rate detected.")

    except Exception as e:
        logger.exception(f"Error during SSE client run: {e}")

if __name__ == "__main__":
    debug_sse()
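
If the stream completed, debug_audio.wav should be a playable mono WAV. Here is a quick sanity check using only the standard library (assuming the default OUT path above):

import wave

# Report the duration and sample rate of the saved file.
with wave.open("debug_audio.wav", "rb") as f:
    duration = f.getnframes() / f.getframerate()
    print(f"{duration:.2f}s of {f.getframerate()} Hz mono audio")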

About

This repo is built and maintained by Tensorfuse. We're building a serverless GPU runtime that lets you self-host AI models of any modality in your own AWS account.

If you're building AI agents, voice agents, chatbots, etc., and want to customize and deploy models, check out Tensorfuse.

Below is a list of resources to help you get started.

🔗 Links and Resources

Type              Links
Get Started       Start here
📚 Documentation  Read Our Docs
Twitter (aka X)   Follow us on X
🔮 Model Library  SOTA models hosted on AWS using Tensorfuse
✍️ Blog           Read our Blogs

For any enquiries, email us at founders@tensorfuse.io.
