Stars
Training code and dataset cleansing with Sidon
A TTS model capable of generating ultra-realistic dialogue in one pass.
🚀 Efficient implementations of state-of-the-art linear attention models
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
Faster Whisper transcription with CTranslate2
French instruction-following and chat models
Robust Speech Recognition via Large-Scale Weak Supervision
Avocodo: Generative Adversarial Network for Artifact-free Vocoder
A walkthrough of transformer architecture code
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
A Python library for audio data augmentation. Useful for making audio ML models work well in the real world, not just in the lab.
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
A tokenizer, text cleaner, and phonemizer for many human languages.
[Does not work anymore!] Script to enable systemd support on current Ubuntu WSL2 images
🐸 - A general purpose model trainer, as flexible as it gets
Training an n-gram based Language Model using KenLM toolkit for Deep Speech 2
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
An easy way to fine-tune Wav2Vec 2.0 for low-resource languages.
Simple but maybe too simple config management through Python data classes. We use it for machine learning.
This repository contains the source code for the paper First Order Motion Model for Image Animation
TensorFlow port of first-order motion model. TF Lite and TF.js compatible, supports the original's checkpoints and implements in-graph kp processing, but inference only (no training).
An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production