Stars
Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, a…
HunyuanImage-2.1: An Efficient Diffusion Model for High-Resolution (2K) Text-to-Image Generation
MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone
VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo
CUDA Python: Performance meets Productivity
Text-audio foundation model from Boson AI
Tiny-FSDP, a minimalistic re-implementation of the PyTorch FSDP
[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves a speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.
🌐 Make websites accessible for AI agents. Automate tasks online with ease.
[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL
ACE-Step: A Step Towards Music Generation Foundation Model
[ICCV2025] From Reusing to Forecasting: Accelerating Diffusion Models with TaylorSeers
Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation
The official PyTorch implementation for Improving Long-Text Alignment for Text-to-Image Diffusion Models (LongAlign)
MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes; NeurIPS 2024; Official code
GeneFace++: Generalized and Stable Real-Time 3D Talking Face Generation; Official Code
[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
depyf is a tool to help you understand and adapt to the PyTorch compiler, torch.compile.
High-resolution models for human tasks.
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
Official inference repo for FLUX.1 models
[ECCV 2024] Official implementation of the paper "X-Pose: Detecting Any Keypoints"
Stable Video Diffusion Training Code and Extensions.