choiHkk

choihk choiHkk

SpeechSynthesis

70 followers · 20 following

Seoul
choihk6610@gmail.com

Stars

172 stars written in Python

facebookresearch / demucs

Code for the paper Hybrid Spectrogram and Waveform Source Separation

Python 9,445 1,326 Updated Apr 24, 2024

voicepaw / so-vits-svc-fork

so-vits-svc fork with realtime support, improved interface and more features.

Python 9,179 1,221 Updated Nov 13, 2025

fishaudio / Bert-VITS2

vits2 backbone with multilingual-bert

Python 8,609 1,247 Updated Nov 10, 2025

QuentinFuxa / WhisperLiveKit

Simultaneous speech-to-text model

Python 8,430 795 Updated Nov 10, 2025

facebookresearch / DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Python 8,021 719 Updated May 31, 2024

jaywalnut310 / vits

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

Python 7,733 1,382 Updated Dec 6, 2023

1adrianb / face-alignment

🔥 2D and 3D face alignment library built using PyTorch

Python 7,426 1,380 Updated Aug 30, 2024

snakers4 / silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Python 7,346 666 Updated Nov 10, 2025

mit-han-lab / streaming-llm

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Python 7,121 392 Updated Jul 11, 2024

openai / consistency_models

Official repo for consistency models.

Python 6,436 434 Updated Mar 22, 2024

yl4579 / StyleTTS2

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python 6,052 631 Updated Aug 10, 2024

MoonInTheRiver / DiffSinger

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

Python 4,648 780 Updated Mar 19, 2025

neuphonic / neutts-air

On-device TTS model by Neuphonic

Python 3,953 393 Updated Nov 4, 2025

facebookresearch / encodec

State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.

Python 3,831 342 Updated Jan 4, 2024

facebookresearch / flow_matching

A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.

Python 3,707 251 Updated Sep 25, 2025

lucidrains / vector-quantize-pytorch

Vector (and Scalar) Quantization, in Pytorch

Python 3,680 300 Updated Nov 12, 2025

modelscope / ClearerVoice-Studio

An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

Python 3,624 293 Updated Aug 14, 2025

Plachtaa / seed-vc

zero-shot voice conversion & singing voice conversion, with real-time support

Python 3,406 398 Updated Apr 20, 2025

KoljaB / RealtimeVoiceChat

Have a natural, spoken conversation with AI!

Python 3,334 368 Updated Jul 11, 2025

openai / glow

Code for reproducing results in "Glow: Generative Flow with Invertible 1x1 Convolutions"

Python 3,175 523 Updated Jul 23, 2024

resemble-ai / Resemblyzer

A python package to analyze and compare voices with deep learning

Python 3,147 468 Updated Oct 12, 2023

haoheliu / AudioLDM

AudioLDM: Generate speech, sound effects, music and beyond, with text.

Python 2,768 248 Updated Jun 25, 2025

IAHispano / Applio

A simple, high-quality voice conversion tool focused on ease of use and performance.

Python 2,682 455 Updated Nov 5, 2025

lucidrains / audiolm-pytorch

Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch

Python 2,604 279 Updated Jan 12, 2025

haoheliu / AudioLDM2

Text-to-Audio/Music Generation

Python 2,519 202 Updated Sep 29, 2024

nateshmbhat / pyttsx3

Offline Text To Speech synthesis for python

Python 2,438 354 Updated Nov 6, 2025

r9y9 / wavenet_vocoder

WaveNet vocoder

Python 2,367 496 Updated Jul 29, 2023

yxlllc / DDSP-SVC

Real-time end-to-end singing voice conversion system based on DDSP (Differentiable Digital Signal Processing)

Python 2,325 278 Updated Aug 16, 2025

allenai / longformer

Longformer: The Long-Document Transformer

Python 2,174 288 Updated Feb 8, 2023

archinetai / audio-diffusion-pytorch

Audio generation using diffusion models, in PyTorch.

Python 2,079 178 Updated Jun 12, 2023