Stars
Voice Activity Detector (VAD): low-latency, high-performance, and lightweight
As little as 1 minute of voice data can be used to train a good TTS model (few-shot voice cloning)
Official Implementation of GLAP - General Language Audio Pretraining
VoiceBench: Benchmarking LLM-Based Voice Assistants
Official inference framework for 1-bit LLMs
Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outstanding singing lyrics rec…
Partitioned-block-based frequency-domain Kalman filter
TTSAudioNormalizer is a specialized tool for TTS data production, featuring descriptive statistical analysis of audio loudness and loudness normalization operations.
Different implementations of "Weighted Prediction Error" for speech dereverberation
Python library & examples for Masked Language Model Scoring (ACL 2020)
Chinese text classification using BERT and ERNIE
Large-Scale Chinese Corpus for NLP
Official implementation of "Separate Anything You Describe"
E2E system with LF-MMI; word N-gram for Mandarin
ManyEars Sound Source Localization, Tracking and Separation
End-to-end ASR/LM implementation with PyTorch