Lists (31)
AEC
acoustic echo cancellation
AFX
audio effect
ASR
ASV-spoof
AVSE
audio-visual speech enhancement
binaural&spatial
BWE
bandwidth extension
C/Python SPTK
Speech toolkits, some of which may be big projects
codec
CT
coding technique
Dataset
Contains datasets
detector
detect everything
DOA
direction of arrival
Interesting
Something interesting for me
KWS
keyword spotting
Light-Weighting
Making models lightweight
LLM
MCSE
Multi-Channel Speech Enhancement
Metric
metric list
NN deploy
Neural Network deployment
NS
noise suppression
Others
some useful but trivial things
PNS
Personalized Noise Suppression
rust
SER
Speech Emotion Recognition
Spatial audio
Speaker
About speaker verification, speaker diarization, speaker recognition
speech accessment
SSSL
self-supervised speech learning
TTS
Text to Speech
VAD
Stars
Implementation of "Personal VAD 2.0: Optimizing Personal Voice Activity Detection for On-Device Speech Recognition"
This project focuses on audio processing and filter simulation research. It uses Python for simulation experiments and C++ for engineering implementation, covering extensive machine learning practi…
SoloSpeech: Enhancing Intelligibility and Quality in Target Speech Extraction through a Cascaded Generative Pipeline
https://deeplearning101.twman.org/Speech-Processing Speech Processing (語音處理)
Repo for Paper: Towards Robust Speaker Recognition against Intrinsic Variation with Foundation Model Few-shot Tuning and Effective Speech Synthesis
(ICASSP 2025, official code) FlowSE: Flow Matching-based Speech Enhancement
X-ASR is a series of automatic speech recognition models based on the icefall framework, focusing on streaming ASR and low-latency deployment.
Chinese text normalization for speech processing
Graphs that teach > graphs that impress. Turn any code into an interactive knowledge graph you can explore, search, and ask questions about. Works with Claude Code, Codex, Cursor, Copilot, Gemini C…
Unofficial PyTorch implementation of "Moises-Light: Resource-efficient Band-split U-Net For Music Source Separation"
Daehwa Kim and Chris Harrison. "SoundBubble: Finger-Bound Virtual Microphone using Headset/Glasses Beamforming" CHI 2026
Bridge local AI coding agents (Claude Code, Cursor, Gemini CLI, Codex) to messaging platforms (Feishu/Lark, DingTalk, Slack, Telegram, Discord, LINE, WeChat Work). Chat with your AI dev assistant f…
A Two-Stage-Based Acoustic Echo Cancellation System
This is the repository for the work "DegVoC: Rethinking Neural Vocoder from a Degradation Perspective", which was accepted at AAAI 2026.
A single CLAUDE.md file to improve Claude Code behavior, derived from Andrej Karpathy's observations on LLM coding pitfalls.
DeepVQE reimplementation in PyTorch and GGML — real-time acoustic echo cancellation with soft delay estimation
Lean neural real-time acoustic echo cancellation with soft delay estimation - GGML and PyTorch inference
AnyEnhance-based Baseline for the CCF-AATC 2025 Challenge Track 1
Official implementation of UniverSR (ICASSP 2026)
Academic Research Skills for Claude Code: research → write → review → revise → finalize
DPDFNet: causal single-channel speech enhancement that boosts DeepFilterNet2 with dual-path RNN blocks for stronger long-range temporal and cross-band modeling. Repo includes PyTorch implementation…