Lists (26)
AcousticFrontend
AcousticModel
ASR
ASR-pretrain
ASV
AudioQuality
AwesomeList
Paper list, awesome list and so on.
BandwidthExtension
Classification
Codec
Data
Develop
Evaluation
FrontEnd
FrontEnd for Text-to-Speech
How-to
LLM
Music
Performance
Quant
SingingVoiceSynthesis
SpeechEditing
SpeechSeperation
Tools
Universal Method
Vocoder
VoiceConversion
Starred repositories
eSpeak NG is an open-source speech synthesizer that supports more than a hundred languages and accents.
MPI programming lessons in C and executable code examples
High-efficiency floating-point neural network inference operators for mobile, server, and Web
Voice Activity Detector (VAD): low-latency, high-performance, and lightweight
Audio decoding libraries for C/C++, each in a single source file.
PESQ (Perceptual Evaluation of Speech Quality) Wrapper for Python Users (narrow band and wide band)
Voice activity detection (VAD) library, based on WebRTC's VAD engine
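A C++ implementation of ASR inference. It has few dependencies, is simple to install, and runs inference quickly, including on ARM platforms such as the Raspberry Pi 4B. The supported models are optimized from Google's Transformer model, and the training data is the open-source wenetspeech dataset (10,000+ hours) or an Alibaba private dataset (60,000+ hours), so recognition quality is good and comparable to many commercial ASR products.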
Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deplo…
ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations
A Simple and Efficient Audio Resampler Implementation in C
APM-related code extracted from WebRTC, including AEC/NS/AGC/VAD; also includes mp3/aac encoders and SoundTouch.