
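A direct-link parsing and download service that aggregates multiple mainstream cloud drives, with one-click parsing and download. Already supports Quark Drive, UC Drive, Lanzou Cloud, Lanzou Youxiang, Feijipan, 123 Cloud Drive, China Mobile, China Unicom, Tianyi Cloud, WPS, and others. Supports parsing shared folders. Demo: https://189.qaiu.top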

Java 2,754 238 Updated Apr 12, 2026

🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.

Python 33,308 6,914 Updated Apr 12, 2026

⏰ Agentically track worldwide conference deadlines (Website, Python CLI, WeChat Applet)

Rust 8,867 587 Updated Apr 12, 2026

The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.

Python 10,998 857 Updated Apr 11, 2026

A web-based collaborative LaTeX editor

JavaScript 17,550 1,947 Updated Apr 10, 2026

SOTA Open Source TTS

Python 29,276 2,466 Updated Apr 6, 2026

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

Python 14,315 2,113 Updated Apr 4, 2026

Extract phoneme-level timestamps from speech audio.

Python 125 13 Updated Apr 2, 2026

Collection of pretrained models for the Montreal Forced Aligner

Python 193 25 Updated Mar 31, 2026

Command line utility for forced alignment using Kaldi

Python 1,789 288 Updated Mar 31, 2026

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 9,753 807 Updated Mar 25, 2026

[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching

Jupyter Notebook 1,279 192 Updated Mar 16, 2026

Multilingual large voice generation model with full-stack support for inference, training, and deployment.

Python 20,521 2,355 Updated Mar 16, 2026

A collection of diffusion model papers categorized by subarea

2,188 100 Updated Mar 16, 2026

Official implementation of the RAVE model: a Realtime Audio Variational autoEncoder

Python 1,709 219 Updated Mar 7, 2026

Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) proposed in CosyVoice

Python 509 68 Updated Dec 22, 2025

Fast and High-Quality Zero-Shot Text-to-Speech with Flow Matching

Python 947 134 Updated Dec 2, 2025

Official implementation of paper: Shallow Flow Matching for Coarse-to-Fine Text-to-Speech Synthesis

Python 52 7 Updated Sep 20, 2025

Text-to-Audio/Music Generation

Python 2,609 208 Updated Sep 29, 2024

Apply diffusion models using the new Hugging Face diffusers package to synthesize music instead of images.

Jupyter Notebook 788 78 Updated Sep 25, 2024

Official implementation of "MoMask: Generative Masked Modeling of 3D Human Motions (CVPR2024)"

Python 1,268 103 Updated Sep 13, 2024

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python 6,238 670 Updated Aug 10, 2024