Highlights
AI Voice
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Instant voice cloning by MIT and MyShell. Audio foundation model.
Just 1 minute of voice data is enough to train a good TTS model (few-shot voice cloning).
Multilingual large voice generation model with full-stack support for inference, training, and deployment.
SubFix: Efficient Web-Based Audio Subtitle Editing and Multilingual Automatic Annotation Tool.
MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing, etc.