99-song

12 stars written in Python

🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.

Python 32,706 6,741 Updated Feb 6, 2026

Multilingual large voice generation model with full-stack support for inference, training, and deployment.

Python 19,497 2,198 Updated Feb 4, 2026

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 9,685 794 Updated May 27, 2025

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python 6,159 659 Updated Aug 10, 2024

Text-to-Audio/Music Generation

Python 2,575 205 Updated Sep 29, 2024

Command line utility for forced alignment using Kaldi

Python 1,739 280 Updated Feb 2, 2026

Official implementation of the RAVE model: a Realtime Audio Variational autoEncoder

Python 1,673 214 Updated Jun 23, 2025

Official implementation of "MoMask: Generative Masked Modeling of 3D Human Motions (CVPR2024)"

Python 1,226 102 Updated Sep 13, 2024

Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) proposed in CosyVoice

Python 503 66 Updated Dec 22, 2025

Collection of pretrained models for the Montreal Forced Aligner

Python 185 25 Updated Oct 6, 2025

Extract phoneme-level timestamps from speech audio.

Python 116 12 Updated Jan 15, 2026

Official implementation of paper: Shallow Flow Matching for Coarse-to-Fine Text-to-Speech Synthesis

Python 50 7 Updated Sep 20, 2025