Local voice chatbot for engaging conversations, powered by Ollama, Hugging Face Transformers, and Coqui TTS Toolkit
A User Interface for XTTS-2 Text-Based Voice Cloning using only 10 seconds of speech
Text to Speech using Coqui TTS + RVC
Free voice cloning for creators using Coqui XTTS-v2 on Google Colab. Clone your voice with just a few minutes of audio. Complete guide to build your own notebook.
Local-first CLI that turns Markdown scripts into multi-speaker podcast-style audio using Coqui XTTS v2.
A framework for AI WhatsApp calls using Whisper, Coqui TTS, GPT-3.5 Turbo, Virtual Audio Cable, and the WhatsApp Desktop App.
SPLAA is an AI assistant framework that utilizes voice recognition, text-to-speech, and tool-calling capabilities to provide a conversational and interactive experience. It uses LLMs available through Ollama and has capabilities for extending functionalities through a modular tool system.
Open Translator: speech-to-speech and speech-to-text translator with voice cloning and other features
Professional local-first AI production pipeline for long-form narration. Clone voices and generate studio-grade audiobooks (M4B/MP3) using Coqui XTTS-v2, with support for Voxtral (cloud).
With this tool you can create a custom TTS dataset from video or audio.
DoyenTalker uses deep learning techniques to generate personalized avatar videos that speak user-provided text in a specified voice. The system utilizes Coqui TTS for text-to-speech generation, along with various face rendering and animation techniques to create a video where the given avatar articulates the speech.
The TTS Platform leverages the power of Coqui TTS, an advanced open-source framework, to deliver a high-quality text-to-speech (TTS) experience. It caters to diverse user needs, offering natural-sounding voice generation with extensive customization options.
A lightweight voice companion, optimized for macOS.
Python command-line text-to-speech (TTS) tool, especially for German, leveraging numerous endpoints such as orpheus, piper, outetts, kokoro, csm, edge, coqui, kartoffelbox, etc.
High-performance Coqui TTS API server with a hybrid "Hot/Cold" worker architecture
Training XTTS V2 and PEFT LORA Text-to-Speech (TTS)
AI Customer Support Agent with RAG + Voice: a fully local, privacy-focused AI customer support system powered by open-source models. It supports multi-turn conversations, PDF/manual search (RAG), product recommendations, FAQ reasoning, and voice input/output using local STT + TTS. Built with Mistral 7B GGUF, FAISS, Whisper Tiny, Coqui TTS, and FastAPI.
EchoSight is a tool that helps visually impaired individuals by audibly describing images taken with a Raspberry Pi Camera or inputted via image path or URL across different operating systems.