Stars
OpenSeeker: A search agent with open-source data and models
Helios: Real-Time Long Video Generation Model
Real-Time Physical Action-Conditioned Video Generation
Code release of [ICCV2025 Highlight] WonderPlay: Dynamic 3D Scene Generation from a Single Image and Actions
Train transformer language models with reinforcement learning.
NVIDIA FastGen: Fast Generation from Diffusion Models
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
[ICLR & NeurIPS 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation.
Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference
[ICLR 2026] pi-Flow: Policy-Based Few-Step Generation via Imitation Distillation
[ICLR 2026] rCM: SOTA JVP-Based Diffusion Distillation & Few-Step Video Generation & Scaling Up sCM/MeanFlow
[ICML2025, NeurIPS2025 Spotlight] Sparse VideoGen 1 & 2: Accelerating Video Diffusion Transformers with Sparse Attention
[ICLR 2026] Official Repo for Rolling Forcing: Autoregressive Long Video Diffusion in Real Time
SOTAMak1r / Infinite-Forcing
Forked from guandeh17/Self-Forcing. Infinite-Forcing: Towards Infinite-Long Video Generation
Official PyTorch Implementation of "Latent Denoising Makes Good Visual Tokenizers"
[CVPR 2026] SpatialVID: A Large-Scale Video Dataset with Spatial Annotations
Makes Self Forcing endless, adding cache purging and prompt controllability.
Official codebase for "Self Forcing: Bridging Training and Inference in Autoregressive Video Diffusion" (NeurIPS 2025 Spotlight)
(CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models