huutuongtu

😀

Huh?

Huu Tuong Tu huutuongtu

Strygwyr

16 followers · 62 following

Vietnam

Stars

289 stars written in Python

facebookresearch / blt

Code for BLT research paper

Python 2,006 180 Updated Nov 3, 2025

etched-ai / open-oasis

Inference script for Oasis 500M

Python 1,974 166 Updated Nov 8, 2024

LAION-AI / CLAP

Contrastive Language-Audio Pretraining

Python 1,885 191 Updated May 15, 2025

LTH14 / mar

PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838

Python 1,782 107 Updated Sep 27, 2024

Standard-Intelligence / hertz-dev

first base model for full-duplex conversational audio

Python 1,768 112 Updated Jan 5, 2025

tsurumeso / vocal-remover

Vocal Remover using Deep Neural Networks

Python 1,726 251 Updated Jul 23, 2024

Omni-Avatar / OmniAvatar

Python 1,708 155 Updated Aug 6, 2025

undertheseanlp / underthesea

Underthesea - Vietnamese NLP Toolkit

Python 1,622 288 Updated Nov 7, 2025

descriptinc / descript-audio-codec

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.

Python 1,611 160 Updated Nov 4, 2025

FireRedTeam / FireRedASR

Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outstanding singing lyrics rec…

Python 1,571 138 Updated Sep 22, 2025

yifan123 / flow_grpo

[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL

Python 1,553 84 Updated Nov 4, 2025

mpezeshki / pytorch_forward_forward

Implementation of Hinton's forward-forward (FF) algorithm - an alternative to back-propagation

Python 1,487 143 Updated Sep 6, 2023

sihyun-yu / REPA

[ICLR'25 Oral] Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think

Python 1,402 64 Updated Mar 16, 2025

dome272 / Diffusion-Models-pytorch

Pytorch implementation of Diffusion Models (https://arxiv.org/pdf/2006.11239.pdf)

Python 1,384 305 Updated Sep 7, 2023

bytedance / music_source_separation

Python 1,368 201 Updated Apr 18, 2024

lucidrains / naturalspeech2-pytorch

Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in Pytorch

Python 1,332 105 Updated Sep 24, 2023

marl / crepe

CREPE: A Convolutional REpresentation for Pitch Estimation -- pre-trained model (ICASSP 2018)

Python 1,304 171 Updated Aug 19, 2024

jishengpeng / WavTokenizer

[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling

Python 1,224 104 Updated Mar 2, 2025

gitmylo / audio-webui

A webui for different audio related Neural Networks

Python 1,214 107 Updated May 19, 2025

Alexander-H-Liu / End-to-end-ASR-Pytorch

This is an open source project (formerly named Listen, Attend and Spell - PyTorch Implementation) for end-to-end ASR implemented with Pytorch, the well known deep learning toolkit.

Python 1,212 316 Updated Dec 19, 2020

stepfun-ai / Step-Audio2

Step-Audio 2 is an end-to-end multi-modal large language model designed for industry-strength audio understanding and speech conversation.

Python 1,201 85 Updated Sep 22, 2025

FireRedTeam / FireRedTTS2

Long-form streaming TTS system for multi-speaker dialogue generation

Python 1,198 106 Updated Oct 26, 2025

maum-ai / voicefilter

Unofficial PyTorch implementation of Google AI's VoiceFilter system

Python 1,166 239 Updated Jul 25, 2024

KinWaiCheuk / nnAudio

Audio processing by using pytorch 1D convolution network

Python 1,093 96 Updated May 16, 2025

lhotse-speech / lhotse

Tools for handling multimodal data in machine learning projects.

Python 1,079 255 Updated Oct 31, 2025

jiachenzhu / DyT

Code release for DynamicTanh (DyT)

Python 1,020 85 Updated Mar 30, 2025

ddlBoJack / emotion2vec

[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

Python 982 76 Updated Dec 23, 2024

haidog-yaqub / MeanFlow

Pytorch Implementation (unofficial) of the paper "Mean Flows for One-step Generative Modeling" by Geng et al.

Python 925 54 Updated Oct 16, 2025

X-LANCE / SLAM-LLM

A Framework for Speech, Language, Audio, Music Processing with Large Language Model

Python 914 97 Updated Oct 24, 2025

Kyubyong / g2p

g2p: English Grapheme To Phoneme Conversion

Python 891 134 Updated Jan 5, 2023

Lists (16)

Aligner

Audio Enhancement

DATASET

improve_model_architecture

Interactive AI

MDD

MLOPS

SE

Singing Voice

Speaker Diarization

Speech LLM

Speech quality assessment

Speech Separation

Speech Tokenizer

Tool

trader
