Organizations

@msra-alumni @MLNLP-World @NATSpeech

833 results for source starred repositories

An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

Python 18,472 2,277 Updated Dec 2, 2025

Official Python inference and LoRA trainer package for the LTX-2 audio–video generative model.

Python 3,510 462 Updated Jan 29, 2026

Pre-built wheels that erase Flash Attention 3 installation headaches.

Python 50 Updated Feb 2, 2026

Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, a…

Rust 6,009 540 Updated Feb 4, 2026

HunyuanImage-2.1: An Efficient Diffusion Model for High-Resolution (2K) Text-to-Image Generation

Python 672 54 Updated Oct 14, 2025

MiniCPM-o 4.5: A Gemini 2.5 Flash Level MLLM for Vision, Speech, and Full-Duplex Multimodal Live Streaming on Your Phone

Python 22,849 1,734 Updated Feb 5, 2026

VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo

Python 1,605 142 Updated Feb 5, 2026

CUDA Python: Performance meets Productivity

Cython 3,156 243 Updated Feb 5, 2026

Open-source unified multimodal model

Python 5,632 498 Updated Oct 27, 2025

Text-audio foundation model from Boson AI

Python 7,900 601 Updated Jan 18, 2026

Tiny-FSDP, a minimalistic re-implementation of the PyTorch FSDP

Python 93 8 Updated Aug 20, 2025

[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.

Cuda 3,134 332 Updated Jan 17, 2026

🌐 Make websites accessible for AI agents. Automate tasks online with ease.

Python 77,792 9,207 Updated Feb 5, 2026
Python 221 17 Updated Mar 21, 2023
Python 24 Updated May 28, 2025

[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL

Python 1,957 123 Updated Nov 4, 2025

ACE-Step: A Step Towards Music Generation Foundation Model

Python 3,856 477 Updated Jan 28, 2026

[ICCV2025] From Reusing to Forecasting: Accelerating Diffusion Models with TaylorSeers

Python 362 22 Updated Aug 11, 2025

Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation

Python 4,478 337 Updated Jun 21, 2025

The uncompromising Python code formatter

Python 41,349 2,718 Updated Jan 31, 2026
Python 6,068 469 Updated Aug 29, 2025

The official PyTorch implementation for Improving Long-Text Alignment for Text-to-Image Diffusion Models (LongAlign)

Python 80 5 Updated Apr 23, 2025

MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes; NeurIPS 2024; Official code

Python 811 105 Updated Oct 16, 2024

GeneFace++: Generalized and Stable Real-Time 3D Talking Face Generation; Official Code

Python 1,802 258 Updated Oct 18, 2024

[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation

Python 7,554 762 Updated Jan 22, 2025

depyf is a tool to help you understand and adapt to PyTorch compiler torch.compile.

Python 783 26 Updated Oct 13, 2025

High-resolution models for human tasks.

Python 5,277 315 Updated Nov 18, 2024

text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Python 12,396 1,251 Updated Nov 4, 2025

Official inference repo for FLUX.1 models

Python 25,187 1,851 Updated Jul 31, 2025