RayeRen
🎯
Focusing

Organizations

@msra-alumni @MLNLP-World @NATSpeech

Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, a…

Rust 5,840 501 Updated Dec 20, 2025

HunyuanImage-2.1: An Efficient Diffusion Model for High-Resolution (2K) Text-to-Image Generation

Python 663 52 Updated Oct 14, 2025

MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone

Python 22,425 1,689 Updated Sep 24, 2025

VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo

Python 1,441 121 Updated Dec 20, 2025

CUDA Python: Performance meets Productivity

Cython 3,100 234 Updated Dec 19, 2025

Open-source unified multimodal model

Python 5,489 481 Updated Oct 27, 2025

Text-audio foundation model from Boson AI

Python 7,754 577 Updated Sep 15, 2025

Tiny-FSDP, a minimalistic re-implementation of the PyTorch FSDP

Python 91 7 Updated Aug 20, 2025

[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.

Cuda 2,876 289 Updated Dec 11, 2025

🌐 Make websites accessible for AI agents. Automate tasks online with ease.

Python 73,974 8,853 Updated Dec 20, 2025
Python 219 18 Updated Mar 21, 2023
Python 23 Updated May 28, 2025

[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL

Python 1,761 104 Updated Nov 4, 2025

ACE-Step: A Step Towards Music Generation Foundation Model

Python 3,479 420 Updated Jun 27, 2025

[ICCV2025] From Reusing to Forecasting: Accelerating Diffusion Models with TaylorSeers

Python 347 22 Updated Aug 11, 2025

Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation

Python 4,393 319 Updated Jun 21, 2025

The uncompromising Python code formatter

Python 41,236 2,692 Updated Dec 19, 2025
Python 6,052 466 Updated Aug 29, 2025

The official PyTorch implementation for Improving Long-Text Alignment for Text-to-Image Diffusion Models (LongAlign)

Python 80 5 Updated Apr 23, 2025

MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes; NeurIPS 2024; Official code

Python 801 100 Updated Oct 16, 2024

GeneFace++: Generalized and Stable Real-Time 3D Talking Face Generation; Official Code

Python 1,795 257 Updated Oct 18, 2024

[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation

Python 7,261 726 Updated Jan 22, 2025

depyf is a tool to help you understand and adapt to PyTorch compiler torch.compile.

Python 770 26 Updated Oct 13, 2025

High-resolution models for human tasks.

Python 5,251 309 Updated Nov 18, 2024

Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Python 12,259 1,233 Updated Nov 4, 2025

Official inference repo for FLUX.1 models

Python 24,934 1,829 Updated Jul 31, 2025

Bring portraits to life!

Python 17,484 1,816 Updated Nov 16, 2025

[ECCV 2024] Official implementation of the paper "X-Pose: Detecting Any Keypoints"

Python 768 38 Updated Aug 16, 2024

Stable Video Diffusion Training Code and Extensions.

Python 723 74 Updated Jul 25, 2024