Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 9,753 807 Updated Mar 25, 2026

shivammehta25 / Matcha-TTS

[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching

Jupyter Notebook 1,279 192 Updated Mar 16, 2026

FunAudioLLM / CosyVoice

Multilingual large voice generation model with full-stack support for inference, training, and deployment.

Python 20,521 2,355 Updated Mar 16, 2026

wangkai930418 / awesome-diffusion-categorized

Collection of diffusion model papers categorized by their subareas

2,188 100 Updated Mar 16, 2026

acids-ircam / RAVE

Official implementation of the RAVE model: a Realtime Audio Variational autoEncoder

Python 1,709 219 Updated Mar 7, 2026

xingchensong / S3Tokenizer

Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) proposed in CosyVoice

Python 509 68 Updated Dec 22, 2025

k2-fsa / ZipVoice

Fast and High-Quality Zero-Shot Text-to-Speech with Flow Matching

Python 947 134 Updated Dec 2, 2025

ydqmkkx / ShallowFlowMatching-TTS

Official implementation of paper: Shallow Flow Matching for Coarse-to-Fine Text-to-Speech Synthesis

Python 52 7 Updated Sep 20, 2025

haoheliu / AudioLDM2

Text-to-Audio/Music Generation

Python 2,609 208 Updated Sep 29, 2024

teticio / audio-diffusion

Apply diffusion models using the new Hugging Face diffusers package to synthesize music instead of images.

Jupyter Notebook 788 78 Updated Sep 25, 2024

EricGuo5513 / momask-codes

Official implementation of "MoMask: Generative Masked Modeling of 3D Human Motions (CVPR2024)"

Python 1,268 103 Updated Sep 13, 2024

yl4579 / StyleTTS2

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python 6,238 670 Updated Aug 10, 2024

SONGsong 99-song

Stars

qaiu / netdisk-fast-download

huggingface / diffusers

ccfddl / ccf-deadlines

wandb / wandb

overleaf / overleaf

fishaudio / fish-speech

SWivid / F5-TTS

tabahi / bournemouth-forced-aligner

MontrealCorpusTools / mfa-models

MontrealCorpusTools / Montreal-Forced-Aligner

open-mmlab / Amphion