Stars
A comprehensive benchmark specifically designed to evaluate the interactive response capabilities of world models in 4D settings.
[2026 CVPR] Extending One-Step Image Generation from Class Labels to Text via Discriminative Text Representation
PyTorch re-implementation for MeanFlow
Code for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion"
Official codebase for "Self Forcing: Bridging Training and Inference in Autoregressive Video Diffusion" (NeurIPS 2025 Spotlight)
(CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
A Curated List of Awesome Video World Models with AR Diffusion: Covering Algorithms, Applications, and Infrastructure, Aimed at Serving as a Comprehensive Resource for Researchers, Practitioners, a…
MobilityBench: A Scalable Benchmark for Evaluating Route-Planning Agents in Real-World Mobility Scenarios
IntTravel: A Real-World Dataset and Generative Framework for Integrated Multi-Task Travel Recommendation
Code2World: A GUI World Model via Renderable Code Generation
Official repository for “PixelGen: Pixel Diffusion Beats Latent Diffusion with Perceptual Loss”
[ICLR 2026] Everything in Its Place: Benchmarking Spatial Intelligence of Text-to-Image Models
[ICLR 2026] Harder Is Better: Boosting Mathematical Reasoning via Difficulty-Aware GRPO and Multi-Aspect Question Reformulation
A Curated List of Awesome Works in World Modeling, Aiming to Serve as a One-stop Resource for Researchers, Practitioners, and Enthusiasts Interested in World Modeling.
A curated list of state-of-the-art research in embodied AI, focusing on vision-language-action (VLA) models, vision-language navigation (VLN), and related multimodal learning approaches.
Official implementation of the ICLR 2026 paper "Urban Socio-Semantic Segmentation with Vision-Language Reasoning"
Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization
[CVPR 2026 Findings] Eevee: Towards Close-up High-resolution Video-based Virtual Try-on
Processed / Cleaned Data for Paper Copilot
[EMNLP25] Official code for "POSITION BIAS MITIGATES POSITION BIAS: Mitigate Position Bias Through Inter-Position Knowledge Distillation"
[AAAI 2026] ImagerySearch: Adaptive Test-Time Search for Video Generation Beyond Semantic Dependency Constraints
[ICLR 2026] Advancing End-To-End Pixel-Space Generative Modeling Via Self-Supervised Pre-Training
[ICLR 2026] Tree Search for LLM Agent Reinforcement Learning
[Up-to-date] Large Language Model Agent: A Survey on Methodology, Applications and Challenges
[ICLR 2026] Implementation of "S^2-Guidance: Stochastic Self Guidance for Training-Free Enhancement of Diffusion Models"
[AAAI 2026] Implementation Code for Omni-Effects