Stars
CapSpeech: Enabling Downstream Applications in Style-Captioned Text-to-Speech
A retargetable MLIR-based machine learning compiler and runtime toolkit.
LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM
eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.
Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Andr…
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.
An open source real-time AI inference engine for seamless scaling
GGUF Quantization support for native ComfyUI models
Diffusion model (SD, Flux, Wan, Qwen Image, ...) inference in pure C/C++
A markup-based typesetting system that is powerful and easy to learn.
Multilingual large voice generation model with full-stack support for inference, training, and deployment.
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
The developer platform for on-demand cloud development environments to create software faster and more securely.
Get up and running with OpenAI gpt-oss, DeepSeek-R1, Gemma 3 and other models.
Speech To Speech: an effort toward an open-source, modular GPT-4o
🤖 The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transf…
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
High-resolution models for human tasks.
One minute of voice data is enough to train a good TTS model (few-shot voice cloning).
Silero Models: pre-trained text-to-speech models made embarrassingly simple