choiHkk

choihk choiHkk

SpeechSynthesis

70 followers · 20 following

Seoul
choihk6610@gmail.com

Stars

167 stars written in Python

facebookresearch / demucs

Code for the paper Hybrid Spectrogram and Waveform Source Separation

Python · 9,412 stars · 1,321 forks · Updated Apr 24, 2024

voicepaw / so-vits-svc-fork

so-vits-svc fork with realtime support, improved interface and more features.

Python · 9,175 stars · 1,221 forks · Updated Nov 6, 2025

fishaudio / Bert-VITS2

vits2 backbone with multilingual-bert

Python · 8,602 stars · 1,245 forks · Updated Nov 4, 2025

QuentinFuxa / WhisperLiveKit

Simultaneous speech-to-text model

Python · 8,282 stars · 774 forks · Updated Nov 6, 2025

facebookresearch / DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Python · 7,993 stars · 716 forks · Updated May 31, 2024

jaywalnut310 / vits

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

Python · 7,725 stars · 1,382 forks · Updated Dec 6, 2023

1adrianb / face-alignment

🔥 2D and 3D face alignment library built using PyTorch

Python · 7,414 stars · 1,380 forks · Updated Aug 30, 2024

mit-han-lab / streaming-llm

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Python · 7,110 stars · 391 forks · Updated Jul 11, 2024

openai / consistency_models

Official repo for consistency models.

Python · 6,432 stars · 434 forks · Updated Mar 22, 2024

yl4579 / StyleTTS2

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python · 6,036 stars · 629 forks · Updated Aug 10, 2024

MoonInTheRiver / DiffSinger

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

Python · 4,641 stars · 779 forks · Updated Mar 19, 2025

neuphonic / neutts-air

On-device TTS model by Neuphonic

Python · 3,879 stars · 386 forks · Updated Nov 4, 2025

facebookresearch / encodec

State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.

Python · 3,823 stars · 344 forks · Updated Jan 4, 2024

lucidrains / vector-quantize-pytorch

Vector (and Scalar) Quantization, in Pytorch

Python · 3,665 stars · 297 forks · Updated Nov 5, 2025

facebookresearch / flow_matching

A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.

Python · 3,663 stars · 247 forks · Updated Sep 25, 2025

modelscope / ClearerVoice-Studio

An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

Python · 3,600 stars · 291 forks · Updated Aug 14, 2025

Plachtaa / seed-vc

zero-shot voice conversion & singing voice conversion, with real-time support

Python · 3,389 stars · 396 forks · Updated Apr 20, 2025

KoljaB / RealtimeVoiceChat

Have a natural, spoken conversation with AI!

Python · 3,313 stars · 364 forks · Updated Jul 11, 2025

openai / glow

Code for reproducing results in "Glow: Generative Flow with Invertible 1x1 Convolutions"

Python · 3,174 stars · 523 forks · Updated Jul 23, 2024

resemble-ai / Resemblyzer

A python package to analyze and compare voices with deep learning

Python · 3,142 stars · 468 forks · Updated Oct 12, 2023

haoheliu / AudioLDM

AudioLDM: Generate speech, sound effects, music and beyond, with text.

Python · 2,760 stars · 248 forks · Updated Jun 25, 2025

IAHispano / Applio

A simple, high-quality voice conversion tool focused on ease of use and performance.

Python · 2,678 stars · 450 forks · Updated Nov 5, 2025

lucidrains / audiolm-pytorch

Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch

Python · 2,601 stars · 278 forks · Updated Jan 12, 2025

haoheliu / AudioLDM2

Text-to-Audio/Music Generation

Python · 2,515 stars · 202 forks · Updated Sep 29, 2024

nateshmbhat / pyttsx3

Offline Text To Speech synthesis for python

Python · 2,434 stars · 354 forks · Updated Nov 6, 2025

r9y9 / wavenet_vocoder

WaveNet vocoder

Python · 2,367 stars · 496 forks · Updated Jul 29, 2023

yxlllc / DDSP-SVC

Real-time end-to-end singing voice conversion system based on DDSP (Differentiable Digital Signal Processing)

Python · 2,319 stars · 277 forks · Updated Aug 16, 2025

allenai / longformer

Longformer: The Long-Document Transformer

Python · 2,172 stars · 288 forks · Updated Feb 8, 2023

archinetai / audio-diffusion-pytorch

Audio generation using diffusion models, in PyTorch.

Python · 2,080 stars · 178 forks · Updated Jun 12, 2023

OpenBMB / VoxCPM

VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning

Python · 2,031 stars · 216 forks · Updated Oct 9, 2025