Stars
A local markdown preview server. npx mdts — and you're done.
JGLUE: Japanese General Language Understanding Evaluation
Long-form streaming TTS system for multi-speaker dialogue generation
One-shot learning-based hotword detection.
Codename's rvc fork version 3, based on Applio.
litagin02 / Style-Bert-VITS2
Forked from fishaudio/Bert-VITS2
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles.
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
Zero-shot voice conversion and singing voice conversion, with real-time support
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
Multilingual Voice Understanding Model
Python interface to the WebRTC Voice Activity Detector
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
The lean application framework for Python. Build sophisticated user interfaces with a simple Python API. Run your apps in the terminal and a web browser.
Library for building powerful interactive command line applications in Python
Using joint training speaker encoder with consistency loss to achieve cross-lingual voice conversion and expressive voice conversion
vits2 backbone with multilingual-bert
Faster Whisper transcription with CTranslate2