Local voice chatbot for engaging conversations, powered by Ollama, Hugging Face Transformers, and Coqui TTS Toolkit
A User Interface for XTTS-2 Text-Based Voice Cloning using only 10 seconds of speech
Text to Speech using Coqui TTS + RVC
Free voice cloning for creators using Coqui XTTS-v2 on Google Colab. Clone your voice with just a few minutes of audio. A complete guide to building your own notebook.
A framework for AI WhatsApp calls using Whisper, Coqui TTS, GPT-3.5 Turbo, Virtual Audio Cable, and the WhatsApp Desktop App.
SPLAA is an AI assistant framework that utilizes voice recognition, text-to-speech, and tool-calling capabilities to provide a conversational and interactive experience. It uses LLMs available through Ollama and has capabilities for extending functionalities through a modular tool system.
DoyenTalker uses deep learning techniques to generate personalized avatar videos that speak user-provided text in a specified voice. The system utilizes Coqui TTS for text-to-speech generation, along with various face rendering and animation techniques to create a video where the given avatar articulates the speech.
Open Translator: a speech-to-speech and speech-to-text translator with voice cloning and other cool features
The TTS Platform leverages the power of Coqui TTS, an advanced open-source framework, to deliver a high-quality text-to-speech (TTS) experience. It caters to diverse user needs, offering natural-sounding voice generation with extensive customization options.
With this tool you can create a custom TTS dataset from video or audio.
A lightweight voice companion, optimized for macOS.
(WIP) A Python command-line text-to-speech (TTS) tool, especially for German, leveraging numerous endpoints such as Orpheus, Piper, OuteTTS, Kokoro, CSM, Edge, Coqui, Kartoffelbox, etc.
Local-first CLI that turns Markdown scripts into multi-speaker podcast-style audio using Coqui XTTS v2.
EchoSight is a tool that helps visually impaired individuals by audibly describing images taken with a Raspberry Pi Camera or inputted via image path or URL across different operating systems.
Training XTTS v2 text-to-speech (TTS) with PEFT LoRA
An AI-powered backseat coach to fix your skill issue and/or ruin your day :). Supports popular models from OpenAI, Anthropic, and Google, as well as self-hosted models. Customizable prompting and voice cloning thanks to ElevenLabs and Coqui TTS.
Synthesize speech using state-of-the-art open and closed-source tools
A lightweight, high-performance voice cloning TTS system based on Coqui TTS (XTTS v2), optimized for macOS (Apple Silicon) and Docker.
ChatGPT with voice input and audio responses.
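The common backend across these projects is Coqui TTS with the XTTS-v2 model. As a minimal sketch of the voice-cloning workflow they build on (assuming the `TTS` package is installed; `sample.wav` is a hypothetical reference clip of the target voice, and the `chunk_text` helper is an illustrative workaround for XTTS's per-request character limit, roughly 250 characters for English):

```python
def chunk_text(text: str, limit: int = 250) -> list[str]:
    """Split text at word boundaries into chunks under XTTS's
    per-request character limit (~250 characters for English)."""
    chunks: list[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) > limit and current:
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    # Heavyweight import deferred: requires the `TTS` package
    # (pip install TTS) and downloads XTTS-v2 weights on first run.
    from TTS.api import TTS

    tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
    script = "Hello! This is a locally cloned voice reading a longer script."
    for i, chunk in enumerate(chunk_text(script)):
        tts.tts_to_file(
            text=chunk,
            speaker_wav="sample.wav",  # hypothetical reference clip
            language="en",
            file_path=f"out_{i:03d}.wav",
        )
```

Most of the listed projects wrap some variant of this loop behind a UI, CLI, or chat pipeline.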