Organizations

@s3prl

MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows

Python 116 11 Updated Sep 2, 2025

PyTorch implementation of JiT https://arxiv.org/abs/2511.13720

Python 1,796 105 Updated Dec 8, 2025

Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis

Python 334 45 Updated Jul 21, 2025

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

Python 2,471 213 Updated Dec 16, 2025

Official PyTorch Implementation of "Latent Denoising Makes Good Visual Tokenizers"

Jupyter Notebook 164 4 Updated Dec 17, 2025

This repository contains a series of works on diffusion-based speech tokenizers, including the official implementation of the paper: "TaDiCodec: Text-aware Diffusion Speech Tokenizer for Speech Lan…

Python 69 3 Updated Sep 21, 2025

This repository contains a series of works on diffusion-based speech tokenizers, including the official implementation of the paper: "TaDiCodec: Text-aware Diffusion Speech Tokenizer for Speech Lan…

Python 195 13 Updated Sep 21, 2025

Streamable Text-to-Speech model using a language modeling approach, without vector quantization

Python 106 7 Updated May 20, 2025

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Python 6,427 703 Updated Dec 17, 2025

A method that directly addresses the modality gap by aligning speech tokens with the corresponding text transcription during the tokenization stage.

Python 101 11 Updated Sep 3, 2025

Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis

Python 1,028 121 Updated Aug 7, 2024

Training, inference, and testing of the SAC speech codec model.

Python 91 6 Updated Nov 1, 2025
Python 28 2 Updated Nov 4, 2025

The demo page for ALMTokenizer

Python 55 3 Updated Apr 14, 2025

Official PyTorch Implementation of "Latent Diffusion Model Without Variational Autoencoder".

Python 378 13 Updated Dec 15, 2025

Ming-UniAudio: Speech LLM for Joint Understanding, Generation and Editing with Unified Representation

Python 405 28 Updated Nov 27, 2025

The best ChatGPT that $100 can buy.

Python 38,813 4,890 Updated Dec 9, 2025
JavaScript 2 Updated Oct 12, 2025

Official code for "Semantic-VAE: Semantic-Alignment Latent Representation for Better Speech Synthesis"

Python 102 4 Updated Oct 26, 2025
Jupyter Notebook 139 41 Updated Dec 15, 2025

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support

Python 9,384 1,245 Updated Dec 17, 2025

VoiceStar: Robust, Duration-controllable TTS that can Extrapolate

Python 303 26 Updated May 31, 2025

Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation

Python 4,387 319 Updated Jun 21, 2025

A programmer's guide to cooking at home (Simplified Chinese only).

Dockerfile 96,537 10,726 Updated Dec 9, 2025

VoiceBench: Benchmarking LLM-Based Voice Assistants

Python 309 19 Updated Dec 11, 2025

GLM-4 series: Open Multilingual Multimodal Chat LMs

Python 6,963 601 Updated Jul 4, 2025

Reference implementation for DPO (Direct Preference Optimization)

Python 2,812 233 Updated Aug 11, 2024

verl: Volcano Engine Reinforcement Learning for LLMs

Python 17,555 2,838 Updated Dec 17, 2025