Stars
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…
Towards High-Quality and Efficient Speech Bandwidth Extension with Parallel Amplitude and Phase Prediction
A single-layer, streaming codec model providing SOTA audio quality and discrete tokens designed for superior downstream modelability.
PyTorch Implementation of FastDiff (IJCAI'22)
PyTorch Implementation of ProDiff (ACM-MM'22) with an extremely fast diffusion speech synthesis pipeline
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with PyTorch
A neural full-band audio codec for general audio sampled at 48 kHz with 7.5 kbps or 4.5 kbps.
GenSE: Generative Speech Enhancement via Language Models using Hierarchical Modeling
Official PyTorch code for Deep Audio-Signal Holistic Embeddings
Generation scripts for EARS-WHAM and EARS-Reverb
Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement
Python implementation of performance metrics in Loizou's Speech Enhancement book
Conditional Diffusion Probabilistic Model for Speech Enhancement
Speech enhancement using Wiener filtering and pitch-synchronous STFT phase reconstruction
The MOS system combines components from DNSMOS, NISQA, MOSSSL, and SIGMOS, using the librosa library to process audio waveforms.
Denoising Diffusion Probabilistic Models
Score-based Generative Models (Diffusion Models) for Speech Enhancement and Dereverberation
DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
chazo1994 / Amphion
Forked from open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…
An Open-source Streaming High-fidelity Neural Audio Codec
HeCheng0625 / Amphion
Forked from open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…