Huu Tuong Tu huutuongtu

😀

Huh?

Strygwyr

16 followers · 62 following

Vietnam

Stars

289 stars written in Python

stepfun-ai / Step-Audio

Python · 4,541 stars · 362 forks · Updated Jun 12, 2025

Blealtan / efficient-kan

An efficient pure-PyTorch implementation of Kolmogorov-Arnold Network (KAN).

Python · 4,498 stars · 402 forks · Updated Aug 1, 2024

mosaicml / llm-foundry

LLM training code for Databricks foundation models

Python · 4,350 stars · 575 forks · Updated Oct 27, 2025

MoonshotAI / Kimi-Audio

Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation

Python · 4,324 stars · 308 forks · Updated Jun 21, 2025

huggingface / speech-to-speech

Speech To Speech: an effort toward an open-source, modular GPT-4o

Python · 4,226 stars · 485 forks · Updated Apr 15, 2025

SystemErrorWang / White-box-Cartoonization

Official TensorFlow implementation for the CVPR 2020 paper “Learning to Cartoonize Using White-box Cartoon Representations”

Python · 3,996 stars · 742 forks · Updated Oct 9, 2022

neuphonic / neutts-air

On-device TTS model by Neuphonic

Python · 3,882 stars · 386 forks · Updated Nov 4, 2025

facebookresearch / encodec

State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.

Python · 3,823 stars · 344 forks · Updated Jan 4, 2024

facebookresearch / flow_matching

A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.

Python · 3,665 stars · 248 forks · Updated Sep 25, 2025

lucidrains / vector-quantize-pytorch

Vector (and Scalar) Quantization, in PyTorch

Python · 3,665 stars · 297 forks · Updated Nov 5, 2025

modelscope / ClearerVoice-Studio

An AI-powered speech processing toolkit with open-source SOTA pretrained models, supporting speech enhancement, separation, target speaker extraction, and more.

Python · 3,600 stars · 291 forks · Updated Aug 14, 2025

gpt-omni / mini-omni

An open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.

Python · 3,426 stars · 294 forks · Updated Nov 5, 2024

resemble-ai / Resemblyzer

A Python package to analyze and compare voices with deep learning

Python · 3,143 stars · 468 forks · Updated Oct 12, 2023

ictnlp / LLaMA-Omni

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python · 3,089 stars · 215 forks · Updated May 19, 2025