Stars

Speech & Voice ML

66 repositories

Robust Speech Recognition via Large-Scale Weak Supervision

Python 92,097 11,542 Updated Dec 15, 2025

Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in PyTorch

Python 2,611 280 Updated Jan 12, 2025

Faster Whisper transcription with CTranslate2

Python 19,509 1,628 Updated Nov 19, 2025

A PyTorch-based Speech Toolkit

Python 10,939 1,616 Updated Dec 15, 2025

We provide a PyTorch implementation of the paper "Voice Separation with an Unknown Number of Multiple Speakers", in which we present a new method for separating a mixed audio sequence, in which multi…

Python 1,316 188 Updated Nov 16, 2023

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Jupyter Notebook 8,854 984 Updated Dec 13, 2025

Manipulate audio with a simple and easy high level interface

Python 9,678 1,123 Updated Jul 26, 2025

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python 19,214 2,044 Updated Oct 21, 2025

General Speech Restoration

Python 1,252 151 Updated Feb 17, 2025

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Python 7,648 697 Updated Dec 10, 2025

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 16,310 3,237 Updated Dec 17, 2025

Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation

Python 151 13 Updated Jan 16, 2024

JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.

Jupyter Notebook 4,648 409 Updated Apr 3, 2024

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Python 13,959 1,446 Updated Dec 17, 2025

Official PyTorch Implementation of CleanUNet (ICASSP 2022)

Python 340 58 Updated Oct 11, 2023

A Deep-Learning-Based Chinese Speech Recognition System

Python 8,324 1,910 Updated Sep 6, 2025

End-to-End Speech Processing Toolkit

Python 9,644 2,361 Updated Dec 16, 2025

Code for the paper Hybrid Spectrogram and Waveform Source Separation

Python 9,576 1,367 Updated Apr 24, 2024

Recurrent neural network for audio noise reduction

C 5,222 1,017 Updated Feb 22, 2025

Noise reduction in Python using spectral gating (speech, bioacoustics, audio, time-domain signals)

Jupyter Notebook 1,799 262 Updated Aug 19, 2025

Audio Normalization for Python/ffmpeg

HTML 1,457 125 Updated Nov 9, 2025

Trained neural networks and requisite information and data for rnnoise-nu

C 341 52 Updated Sep 2, 2018

Recurrent neural network for audio noise reduction, slightly improved for general use

C 125 24 Updated Apr 25, 2019

🔊 Awesome list for Whisper — an open-source AI-powered speech recognition system developed by OpenAI

1,952 98 Updated Nov 5, 2025

Evaluate your speech-to-text system with similarity measures such as word error rate (WER)

Python 831 108 Updated Feb 15, 2025

A Python library for audio data augmentation. Useful for making audio ML models work well in the real world, not just in the lab.

Python 2,195 205 Updated Sep 26, 2025

Jupyter Notebook 8,757 626 Updated Oct 25, 2025

A lightweight library to compute Diarization Error Rate (DER).

Python 62 9 Updated Aug 28, 2023

Diarization scoring tools.

Python 260 46 Updated Mar 28, 2023

The collection of pre-trained, state-of-the-art AI models for ailia SDK

Python 2,295 351 Updated Dec 17, 2025