asr-pub: 132 stars written in Python

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 70,799 8,643 Updated Apr 30, 2026

One minute of voice data is enough to train a good TTS model! (few-shot voice cloning)

Python 57,080 6,230 Updated Apr 30, 2026

🚀🚀 [Large model] Train a small 64M-parameter GPT completely from scratch in just 2 hours! 🌏

Python 48,656 6,153 Updated Apr 28, 2026

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python 45,198 6,065 Updated Aug 16, 2024

Easily train a good VC model with voice data <= 10 mins!

Python 35,425 5,014 Updated Nov 24, 2024

PyTorch Tutorial for Deep Learning Researchers

Python 32,308 8,254 Updated Aug 15, 2023

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Python 32,212 6,679 Updated Sep 30, 2025

SOTA Open Source TTS

Python 30,014 2,537 Updated Apr 6, 2026

SoftVC VITS Singing Voice Conversion

Python 28,052 5,067 Updated Nov 11, 2023

Python logging made (stupidly) simple

Python 23,845 791 Updated Apr 6, 2026

Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.

Python 20,819 2,397 Updated Mar 16, 2026

An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

Python 20,272 2,497 Updated Mar 16, 2026

Machine learning, in numpy

Python 16,346 3,766 Updated Oct 29, 2023

A very simple framework for state-of-the-art Natural Language Processing (NLP)

Python 14,369 2,116 Updated Oct 27, 2025

WeChat bot / possibly the most elegant WeChat personal account API ✨✨

Python 14,269 2,382 Updated Jul 14, 2019

An open source implementation of CLIP.

Python 13,760 1,280 Updated Apr 30, 2026

A MNIST-like fashion product database. Benchmark 👇

Python 12,720 3,076 Updated Jun 13, 2022

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translatio…

Python 12,593 1,956 Updated Apr 15, 2026

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open source community will contribute to it.

Python 12,162 1,066 Updated Mar 8, 2026

Official implementation of AnimateDiff.

Python 12,106 1,067 Updated Jul 31, 2024

Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.

Python 10,105 944 Updated Apr 28, 2026

The most powerful local music generation model that outperforms almost all commercial alternatives, supporting Mac, AMD, Intel, and CUDA devices.

Python 9,847 1,173 Updated Apr 30, 2026

End-to-End Speech Processing Toolkit

Python 9,821 2,400 Updated Apr 30, 2026

A Deep-Learning-Based Chinese Speech Recognition System

Python 8,372 1,899 Updated Apr 10, 2026

Code for the paper "Jukebox: A Generative Model for Music"

Python 8,041 1,457 Updated Jun 19, 2024

Text-audio foundation model from Boson AI

Python 8,033 619 Updated Jan 18, 2026

Free Motion Capture for Everyone 💀✨

Python 7,516 632 Updated Apr 30, 2026

Open Source Neural Machine Translation and (Large) Language Models in PyTorch

Python 7,004 2,252 Updated Oct 14, 2025

MMdnn is a set of tools to help users inter-operate among different deep learning frameworks. E.g. model conversion and visualization. Convert models between Caffe, Keras, MXNet, Tensorflow, CNTK, …

Python 5,812 957 Updated Aug 7, 2025

Inference and training library for high-quality TTS models.

Python 5,572 588 Updated Dec 10, 2024