Shanghai Innovation Institute
- artpli.github.io
- @artpli_
Stars
GR00T-VisualSim2Real: Open-source sim-to-real framework for humanoid visual loco-manipulation. Train in simulation, deploy zero-shot on real robots with RGB + proprioception for tasks like pick-and…
H-RDT: Human Manipulation Enhanced Bimanual Robotic Manipulation
[CVPR 2025 highlight] Generating 6DoF Object Manipulation Trajectories from Action Description in Egocentric Vision
HumanEgo: Zero-Shot Robot Learning from Minutes of Human Egocentric Videos
Egocentric humanoid manipulation benchmark
Open-source Unitree G1 Vision-Language-Action stack for teleop data collection, SonicLatent training, simulation, and real-time whole-body policy deployment (real-world deployment TBD).
🎥 Python and OpenCV-based scene cut/transition detection program & library.
Official codebase for Fast-WAM: Do World Action Models Need Test-time Future Imagination?
ROSA 🤖 is an AI Agent designed to interact with ROS1- and ROS2-based robotics systems using natural language queries. ROSA helps robot developers inspect, diagnose, understand, and operate robots.
[RSS 2026] The first framework enabling humanoid robots to learn whole-body loco-manipulation from egocentric human demos
Official code base for LeWorldModel: Stable End-to-End Joint-Embedding Predictive Architecture from Pixels
GigaWorld-Policy: An Efficient Action-Centered World–Action Model
This is the official code repo for DiT4DiT, a Vision-Action-Model (VAM) framework that combines video generation model with flow-matching-based action prediction for generalizable robotic manipulat…
Welcome to SIMPLE, a full-stack simulation environment for humanoid loco-manipulation, built on AMO/SONIC, with integrated support for mainstream VLAs such as Psi0, Pi05, GR00T, DreamZero and more.
Sentdex / kimolab
Forked from mujocolab/mjlab. Isaac Lab API, powered by MuJoCo-Warp, for RL and robotics research.
Replication of mimic-video: Video-Action Models for Generalizable Robot Control Beyond VLAs
SOMA BVH to humanoid robot motion retargeting library built with Newton and NVIDIA Warp
[RSS 2024] Consistency Policy: Accelerated Visuomotor Policies via Consistency Distillation
Dexbotic: Open-Source Vision-Language-Action Toolbox
One framework to evaluate any VLA model on any robot simulation benchmark.
Open-source framework for the research and development of foundation models.
An agent-managed museum exhibit, built in Rust with Gajae-Code / LazyCodex — developed and maintained with no human intervention.
A real-time video understanding foundation model with gated cross-attention. Offline & real-time inference.
A Super AI Lab with massive AI Doctors as Assistants. Best IDE for Research via AI Power.
We propose Reinforcement Learning from Community Feedback (RLCF), a training paradigm that uses large-scale community signals as supervision, and formulate scientific taste learning as a preference…
This website is for the collection of VLA SOTA results.