Stars
Official implementation of Learning Athletic Humanoid Tennis Skills from Imperfect Human Motion Data
CLI-Anything: Making ALL Software Agent-Native
Access 175+ robot descriptions from the main Python robotics frameworks
A lightweight, AI-native training framework for large language models. Designed for fast iteration, reproducible experiments, and modular configuration across SFT, RLVR, and evaluation workflows.
[ICLR 2026] Disentangled Robot Learning via Separate Forward and Inverse Dynamics Pretraining
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
VLA-JEPA: Enhancing Vision-Language-Action Model with Latent World Model
Fast, Sharp & Reliable Agentic Intelligence
Step3-VL-10B: A compact yet frontier multimodal model achieving SOTA performance at the 10B scale, matching open-source models 10-20x its size.
Holistic Evaluation of Multimodal LLMs on Spatial Intelligence
HY-Motion model for 3D human motion or 3D character animation generation.
Native and Compact Structured Latents for 3D Generation
An Open Phone Agent Model & Framework. Unlocking the AI Phone for Everyone
[CVPR 2026] G2VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning
Code for "PHUMA: Physically-Grounded Humanoid Locomotion Dataset"
[NeurIPS D&B 2025] PartNeXt: A Next-Generation Dataset for Fine-Grained and Hierarchical 3D Part Understanding
NORA: A Small Open-Source Generalist Vision-Language-Action Model for Embodied Tasks
StarVLA: A Lego-like Codebase for Vision-Language-Action Model Development
Reasoning in Space via Grounding in the World (ICLR 2025)
Isaac Lab API, powered by MuJoCo-Warp, for RL and robotics research.
Official implementation of OpenTrack.
InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy
A Paper List for Humanoid Robot Learning.
A curated list of behavior(al) foundation model (BFM) papers, articles, tutorials, slides and projects
[ICCV 2025] Towards More Diverse and Challenging Pre-training for Point Cloud Learning: Self-Supervised Cross Reconstruction with Decoupled Views