Stars
MAGIC-TTS: Fine-Grained Controllable Speech Synthesis with Explicit Local Duration and Pause Control
MOSS‑TTS Family is an open‑source speech and sound generation model family from MOSI.AI and the OpenMOSS team. It is designed for high‑fidelity, high‑expressiveness, and complex real‑world scenario…
Code release for ConvNeXt V2 model
[ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching
Simple and lightweight Zero-shot Text-to-Speech (TTS) synthesis model
This repository implements a novel zero-shot TTS framework, named Flamed-TTS, focusing on efficient generation and dynamic pacing in speech synthesis.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…
The open source code for SimpleSpeech series
The official implementation of HierSpeech++
Command line utility for forced alignment using Kaldi
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
The official PyTorch implementation of the Interspeech 2024 paper "Reshape Dimensions Network for Speaker Recognition"
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translatio…
A generative speech model for daily dialogue.
Just 1 minute of voice data is enough to train a good TTS model! (few-shot voice cloning)