[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Python 7,109 391 Updated Jul 11, 2024

Reference-aware automatic speech evaluation toolkit

Python 168 14 Updated Dec 5, 2024

A next generation HTTP client for Python. 🦋

Python 14,705 971 Updated Oct 16, 2025

[WIP] Resources for AI engineers. Also contains supporting materials for the book AI Engineering (Chip Huyen, 2025)

Jupyter Notebook 11,024 1,580 Updated Feb 12, 2025

Finetune Sesame AI's conversational speech model on new languages and voices. Blog post: https://blog.speechmatics.com/sesame-finetune

Python 90 9 Updated Sep 27, 2025

On-device TTS model by Neuphonic

Python 3,876 384 Updated Nov 4, 2025

Streaming and Fine-tuning for Chatterbox TTS

Python 206 43 Updated Jun 15, 2025

SoTA open-source TTS

Python 104 17 Updated Jun 7, 2025

SoTA open-source TTS

Python 22 1 Updated Jun 17, 2025

SoTA open-source TTS

Python 14,425 1,940 Updated Sep 25, 2025

zero-shot voice conversion & singing voice conversion, with real-time support

Python 3,380 396 Updated Apr 20, 2025

A simple, high-quality voice conversion tool focused on ease of use and performance.

Python 2,676 449 Updated Oct 30, 2025

VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning

Python 2,021 214 Updated Oct 9, 2025

VoXtream is a Full-Stream Zero-shot TTS model with Extremely Low Latency

Python 162 21 Updated Oct 26, 2025

Hibiki is a model for streaming speech translation (also known as simultaneous translation). Unlike offline translation—where one waits for the end of the source utterance to start translating--- H…

Rust 1,316 103 Updated Apr 15, 2025

[ICML 2025] Fourier Position Embedding: Enhancing Attention’s Periodic Extension for Length Generalization

Python 102 5 Updated Jun 2, 2025

Build a Wake Word Detection model for Voice Assistant using PyTorch

Python 25 3 Updated May 9, 2023

OpenReview configuration for EMNLP 2025 demo papers

TeX 6 1 Updated Oct 15, 2025

LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM

Python 288 37 Updated May 16, 2025

Recurrent neural network for audio noise reduction

C 5,118 1,010 Updated Feb 22, 2025

DiFlow-TTS delivers low-latency zero-shot TTS via discrete flow matching and factorized speech tokens. A compact, open framework for fast voice synthesis.🐙

Python 45 4 Updated Nov 5, 2025

Artificial Neural Engine Machine Learning Library

Python 1,229 42 Updated Sep 2, 2025

Simultaneous speech-to-text model

Python 8,268 771 Updated Oct 30, 2025

Implementation of Acoustic BPE (Shen et al., 2024), extended for RVQ-based Neural Audio Codecs

Python 74 8 Updated Jul 18, 2025

TTS support with GGML

C++ 190 23 Updated Oct 5, 2025

An extremely fast Python linter and code formatter, written in Rust.

Rust 43,571 1,600 Updated Nov 5, 2025

A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.

Python 3,661 247 Updated Sep 25, 2025

Educational implementation of the Discrete Flow Matching paper

Jupyter Notebook 122 7 Updated Aug 26, 2024

Discrete Flow Matching implemented in PyTorch

Python 30 2 Updated Mar 23, 2025

Frontier Open-Source Text-to-Speech

9,850 1,249 Updated Sep 5, 2025