Starred repositories

Documentation and analysis of the Claude Code source code

TypeScript 69 59 Updated Apr 1, 2026

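Sharing AI Infra knowledge and coding exercises: getting started with the PyTorch/vLLM/SGLang frameworks ⚡️, performance acceleration 🚀, large-model fundamentals 🧠, AI software and hardware 🔧, and more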

Jupyter Notebook 1,640 134 Updated Apr 8, 2026

Plug-and-play streaming semantic VAD for real-time full-duplex spoken dialogue systems.

Python 179 16 Updated Mar 20, 2026

Pre-training, SFT, DPO and GRPO for Text-to-Audio Generation

Python 45 6 Updated Mar 13, 2026

A SOTA Industrial-Grade Voice Activity Detection & Audio Event Detection, supporting 100+ languages, outperforming Silero-VAD, TEN-VAD, FunASR-VAD and WebRTC-VAD

Python 351 26 Updated Apr 4, 2026

Compute WER and SER for speech recognition evaluation

Python 27 3 Updated Mar 18, 2026

Qwen3-TTS is an open-source series of TTS models developed by the Qwen team at Alibaba Cloud, supporting stable, expressive, and streaming speech generation, free-form voice design, and vivid voice…

Python 10,639 1,380 Updated Mar 17, 2026

Fine-tune the NeMo Parakeet ASR model on a new language (supports 8-bit optimizer). Experimental birwkv-fastconformer TDT for long-form ASR (8.5 hours in a single pass).

Python 20 4 Updated Nov 27, 2025

Very fast, accurate speaker diarization

Python 251 27 Updated Mar 25, 2026

Nano vLLM

Python 12,864 1,919 Updated Apr 13, 2026

A demo-level low-latency, high-throughput inference engine for whisper

Python 19 4 Updated Nov 9, 2025

🐍📦 Ultra-fast Python package for calculating and analyzing the Word Error Rate (WER). Built for the scalable evaluation of speech and transcription accuracy.

Python 25 6 Updated Mar 30, 2026

Ming-UniAudio: Speech LLM for Joint Understanding, Generation and Editing with Unified Representation

Python 441 29 Updated Nov 27, 2025

Text-to-text alignment algorithm for speech recognition error analysis.

Python 28 4 Updated Apr 6, 2026

A source-code walkthrough of the lightweight large language model MiniMind, covering the complete pipeline: tokenizer, RoPE, MoE, KV Cache, pretraining, SFT, LoRA, DPO, and more

920 79 Updated Jun 16, 2025

A Python module to repair invalid JSON from LLMs

Python 4,647 182 Updated Apr 13, 2026

Open-Source Frontier Voice AI

Python 39,323 4,557 Updated Apr 10, 2026
Python 561 82 Updated Mar 10, 2026

Simultaneous speech-to-text models

Python 10,088 1,040 Updated Mar 31, 2026
Python 570 54 Updated Aug 28, 2025

MOSS-TTSD is a spoken dialogue generation model designed for expressive multi-speaker synthesis. It features long-context modeling, flexible speaker control, and multilingual support, while enablin…

Python 1,268 123 Updated Mar 23, 2026

Text-audio foundation model from Boson AI

Python 8,019 616 Updated Jan 18, 2026

Chinese voice corpus with clearer, more natural speech, comprising 8 open-source datasets, 3,200 speakers, 900 hours of speech, and 13 million characters.

726 125 Updated Jun 12, 2020

UTokyo-SaruLab MOS Prediction System

Python 308 30 Updated Apr 2, 2026

A Collection of Papers on Diffusion Language Models

166 7 Updated Sep 15, 2025

Spark-TTS Inference Code

Python 10,962 1,168 Updated Apr 9, 2025

Voice Activity Detector (VAD) : low-latency, high-performance and lightweight

C 2,074 164 Updated Feb 2, 2026

A collection of Chinese text error-correction datasets

Python 36 11 Updated Mar 24, 2026
Python 478 42 Updated May 19, 2025