Stars
Official codebase for Fast-WAM: Do World Action Models Need Test-time Future Imagination?
Bash is all you need - A nano Claude Code-like "agent harness", built from 0 to 1
Official code for "LagerNVS: Latent Geometry for Fully Neural Real-time Novel View Synthesis" (CVPR 2026)
[CVPR2026] TextPecker: Rewarding Structural Anomaly Quantification for Enhancing Visual Text Rendering
A minimal CLI tool to organize, load, and switch between your project's environment contexts.
VLA-JEPA: Enhancing Vision-Language-Action Model with Latent World Model
[RSS 2026] Causal video-action world model for generalist robot control
StarVLA: A Lego-like Codebase for Vision-Language-Action Model Developing
[AAAI 2026 Oral] SpatialActor: Exploring Disentangled Spatial Representations for Robust Robotic Manipulation
The code for paper 'Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors'
A curated list of 3D vision papers related to the robotics domain in the era of large models (i.e., LLMs/VLMs), inspired by awesome-computer-vision, including papers, code, and related websites
Official code of Motus: A Unified Latent Action World Model
Ego-Vision World Model for Humanoid Contact Planning
Humanoid dataset for learning
[ICLR 2026] Towards Unified Latent VLA for Whole-body Loco-manipulation Control
The first open-sourced diffusion vision-language-action model. [ICLR 2026]
[CVPR 2026] Mantis: A Versatile Vision-Language-Action Model with Disentangled Visual Foresight
RoboBrain 2.5: Advanced version of RoboBrain. Depth in Sight, Time in Mind.
[CVPR 2025] RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete. Official Repository.
[ICLR 2026] SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning