Showing results

Piper, the fast and local neural text-to-speech engine, on Android with ncnn

C++ 61 9 Updated Jan 14, 2026

The most powerful local music generation model that outperforms almost all commercial alternatives, supporting Mac, AMD, Intel, and CUDA devices.

Python 9,688 1,147 Updated Apr 24, 2026

"Blockchain That Everyone Can Understand" (《人人能懂的区块链》)

Python 77 3 Updated Nov 29, 2025

Code for the paper "Jukebox: A Generative Model for Music"

Python 8,040 1,458 Updated Jun 19, 2024

Make text LLMs listen and speak

Python 1,276 221 Updated Mar 26, 2026

PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."

Python 600 162 Updated Aug 19, 2023

High-quality speech synthesis with LoRA fine-tuning on index-tts, enhancing prosody and naturalness for single and multi-speaker voices.

Python 301 25 Updated Mar 12, 2026

Python 1,701 196 Updated Nov 15, 2025

Noise suppression using deep filtering

Python 4,103 440 Updated Oct 17, 2024

Free Motion Capture for Everyone 💀✨

Python 7,480 628 Updated Apr 27, 2026

Added vLLM support to IndexTTS for faster inference.

Python 1,134 157 Updated Apr 13, 2026

Text-audio foundation model from Boson AI

Python 8,026 618 Updated Jan 18, 2026

Fast and High-Quality Zero-Shot Text-to-Speech with Flow Matching

Python 965 140 Updated Dec 2, 2025

An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

Python 20,227 2,490 Updated Mar 16, 2026

[NeurIPS'24 Spotlight] Text2CAD: Generating Sequential CAD Designs from Beginner-to-Expert Level Text Prompts

Python 394 65 Updated May 15, 2025

Python 158 8 Updated Nov 22, 2024

All-In-One Music Structure Analyzer

Python 756 121 Updated May 9, 2024

[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling

Python 1,290 111 Updated Mar 2, 2025

Accelerating CosyVoice2 inference with vLLM

Jupyter Notebook 491 65 Updated Apr 26, 2025

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Python 4,028 321 Updated Apr 27, 2026

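A small 0.2B Chinese dialogue model (ChatLM-Chinese-0.2B), with all code open-sourced for the full pipeline: dataset sources, data cleaning, tokenizer training, model pre-training, SFT instruction fine-tuning, and RLHF optimization. Supports SFT fine-tuning for downstream tasks and includes a fine-tuning example for triple information extraction.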

Python 1,703 195 Updated Apr 20, 2024

Implementing a small-parameter Chinese large language model from scratch.

Python 1,011 115 Updated Aug 22, 2024

🚀🚀 "Large models": train a 64M-parameter GPT completely from scratch in just 2 hours! 🌏

Python 48,418 6,105 Updated Apr 27, 2026

Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.

Python 10,076 939 Updated Apr 24, 2026

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Python 20,777 2,387 Updated Mar 16, 2026

Full text of the Contemporary Chinese Dictionary (《现代汉语词典》, 7th edition) in TXT format

301 49 Updated Jun 22, 2024

Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) proposed in CosyVoice

Python 511 68 Updated Dec 22, 2025

[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"

Python 372 24 Updated Sep 3, 2024