  • Acoustic and Speech
  • Shenzhen


AI generates a real, editable PowerPoint from any document — native shapes & animations, speaker notes voiced as audio narration, and the option to follow your own .pptx template, not slide images …

Python 27,015 2,407 Updated Jun 10, 2026

Python implementation of OMLSA+IMCRA algorithm for speech enhancement.

Python 70 21 Updated Jun 29, 2021

An Automated AI Agent Tool for Plotting Your Data in Any Paper's Figure Style.

Python 1 Updated May 24, 2026

MCRA+OMLSA python version

Python 10 1 Updated Jul 11, 2024

A repository for Automatic Speech Recognition (ASR) that ensembles multiple open-source models to achieve SOTA quality of recognition. Useful if you need to get the maximum quality of recognition d…

Python 15 1 Updated May 20, 2026

Operator-level compressed GTCRN with ERB-CRM pipeline preserved and DPGRNN intact, ready for edge deployment.

Python 22 8 Updated Feb 11, 2026

A training code template for DNN-based speech enhancement.

Python 199 46 Updated Sep 4, 2025

This project focuses on audio processing and filter simulation research. It uses Python for simulation experiments and C++ for engineering implementation, covering extensive machine learning practi…

Jupyter Notebook 13 5 Updated May 21, 2026

Graphs that teach > graphs that impress. Turn any code into an interactive knowledge graph you can explore, search, and ask questions about. Works with Claude Code, Codex, Cursor, Copilot, Gemini C…

TypeScript 58,158 4,842 Updated Jun 11, 2026

Traditional Speech Enhancement Methods

MATLAB 148 34 Updated Sep 28, 2025

Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deplo…

C 1,216 219 Updated May 8, 2026

PyTorch implementation of VALL-E(Zero-Shot Text-To-Speech), Reproduced Demo https://lifeiteng.github.io/valle/index.html

Python 2,206 333 Updated Sep 10, 2025

🧠 "Large Model": Train a 64M-parameter LLM from scratch in just 2h!

Python 51,676 6,640 Updated Jun 1, 2026

🎙️ "Large Model": A 0.1B Omni model trained from scratch, capable of listening, speaking, and seeing!

Python 1,837 218 Updated Jun 8, 2026

Single Channel Speech Enhancement Methods and Toolbox

Python 54 14 Updated Apr 8, 2026

Speech enhancement, speech separation, and sound source localization

1,240 224 Updated Nov 14, 2023
2 Updated Apr 25, 2026

A Survey of Continual Learning for Speech and Audio Models

5 Updated May 26, 2026

Audio Coding Notebooks and Tutorials

Jupyter Notebook 89 14 Updated Dec 16, 2020

Python implementation of performance metrics in Loizou's Speech Enhancement book

Python 455 93 Updated Feb 15, 2025

Models for DCASE 2026 Semantic Acoustic Imaging for Sound Event Localization and Detection from Spatial Audio and Audiovisual Scenes

Python 14 1 Updated May 28, 2026

This is the public repository for SALSA-Lite features for polyphonic sound event localization and detection using microphone arrays.

15 4 Updated Dec 3, 2021

You can find the speech algorithms you want here

C 865 262 Updated Jan 25, 2026

[CVPR 2025] PyTorch implementation of the paper "Hearing Anywhere in Any Environment"

Python 33 1 Updated Sep 18, 2025

[TASLP] Open-Vocabulary Sound Event Localization and Detection with Joint Learning of CLAP Embedding and Activity-Coupled Cartesian DOA Vector

Python 9 2 Updated Mar 25, 2026

SOTA industrial-grade voice activity detection and audio event detection, supporting 100+ languages and outperforming Silero-VAD, TEN-VAD, FunASR-VAD and WebRTC-VAD

Python 423 28 Updated May 6, 2026
Python 1 Updated Aug 17, 2025