Stars
PyTorch Lightning Optical Flow models, scripts, and pretrained weights.
[ICLR2026 - Oral] WAFT: Warping-Alone Field Transforms for Optical Flow
Official codebase for Fast-WAM: Do World Action Models Need Test-time Future Imagination?
RLinf: Reinforcement Learning Infrastructure for Embodied and Agentic AI
One framework to evaluate any VLA model on any robot simulation benchmark.
RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots
[ICLR 2026 Oral] Latent Particle World Models official repository
[ICLR 2026] RoboInter: A Holistic Intermediate Representation Suite Towards Robotic Manipulation
Causal video-action world model for generalist robot control
A VAE adapted from Descript Audio Codec, replacing the RVQ with a VAE
REALM: A Real-to-Sim Validated Benchmark for Generalization in Robotic Manipulation
[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling
Official inference repo for FLUX.1 models
An optimized PyTorch framework for behavior cloning with flow-related generative models.
Team Comet's 2025 BEHAVIOR Challenge Codebase
Distribution Matching Variational AutoEncoder (DMVAE)
DC-Gen: Post-Training Diffusion Acceleration with Deeply Compressed Latent Space
BEHAVIOR-1K: a platform for accelerating Embodied AI research. Join our Discord for support: https://discord.gg/bccR5vGFEx
Official implementation of the paper "MagicQuillV2: Precise and Interactive Image Editing with Layered Visual Cues"
[CVPR'2026] "MM-ACT: Learn from Multimodal Parallel Generation to Act"
Qwen3-Omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud. It understands text, audio, images, and video, and generates speech in real time.
Official Implementation of "MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation"
Discrete Diffusion VLA: Bringing Discrete Diffusion to Action Decoding in Vision-Language-Action Policies
SHAILAB-IPEC / EO1
Forked from EO-Robotics/EO1
EO: Open-source Unified Embodied Foundation Model Series