Stars
NVIDIA Cosmos is an open platform of world models, datasets, and tools that enables developers to build Physical AI for robots, autonomous vehicles, smart infrastructure, and more.
StarVLA: A Lego-like Codebase for Vision-Language-Action Model Developing
[RSS 2026] LDA-1B: Scaling Latent Dynamics Action Model via Universal Embodied Data Ingestion
[browser-agent] Never send a human to do a machine's job.
Official codebase for Fast-WAM: Do World Action Models Need Test-time Future Imagination?
[ICLR 2026] The official implementation of "Soft-Prompted Transformer as Scalable Cross-Embodiment Vision-Language-Action Model"
A comprehensive list of papers for the definition of World Models and using World Models for General Video Generation, Embodied AI, and Autonomous Driving, including papers, codes, and related webs…
Code to pretrain, fine-tune, and evaluate DreamZero and run sim & real-world evals
[ICML 2026] Official codebase for "Causal Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactive Video Generation" & Causal Forcing++
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
Benchmarking Knowledge Transfer in Lifelong Robot Learning
[Lumina Embodied AI Community] Embodied AI Technical Guide (Embodied-AI-Guide)
Open-source implementation of AlphaEvolve
CALVIN - A benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks
A curated list of state-of-the-art research in embodied AI, focusing on vision-language-action (VLA) models, vision-language navigation (VLN), and related multimodal learning approaches.
A Foundation Model for Generalist Gaming Agents
Native and Compact Structured Latents for 3D Generation
[ICLR 2026] Official Repo For "BindWeave: Subject-Consistent Video Generation via Cross-Modal Integration"
[CVPR 2026] JarvisEvo: Towards a Self-Evolving Photo Editing Agent with Synergistic Editor-Evaluator Optimization
[CVPR 2026] Official Implementation of Particulate: Feed-Forward 3D Object Articulation
[CVPR 2026] Diff4Splat: Controllable 4D Scene Generation with Latent Dynamic Reconstruction Models.