Stars
[CVPR 2026] Fast-FoundationStereo: Real-Time Zero-Shot Stereo Matching
[NeurIPS 2024] Hamba: Single-view 3D Hand Reconstruction with Graph-guided Bi-Scanning Mamba
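Hung-yi Lee's Deep Learning Tutorial (recommended by Prof. Hung-yi Lee 👍, the "Apple Book" 🍎). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases
Chinese-language reinforcement learning tutorial (the "Mushroom Book" 🍄). Read online: https://datawhalechina.github.io/easy-rl/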
RLinf: Reinforcement Learning Infrastructure for Embodied and Agentic AI
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
Building General-Purpose Robots Based on Embodied Foundation Model
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
[CoRL 2024] Open-TeleVision: Teleoperation with Immersive Active Visual Feedback
The hub for EleutherAI's work on interpretability and learning dynamics
Official implementation of ReconVLA: Reconstructive Vision-Language-Action Model as Effective Robot Perceiver.
Imitation learning algorithms with Co-training for Mobile ALOHA: ACT, Diffusion Policy, VINN
Repository for the CoRL 2024 paper "Learning to Manipulate Anywhere: A Visual Generalizable Framework For Reinforcement Learning"
Code for "Novel Object 6D Pose Estimation with a Single Reference View".
[CVPR 2025] Any6D: Model-free 6D Pose Estimation of Novel Objects
🔥 SpatialVLA: a spatial-enhanced vision-language-action model that is trained on 1.1 Million real robot episodes. Accepted at RSS 2025.
[CVPR 2025] Magma: A Foundation Model for Multimodal AI Agents
GRAPE: Guided-Reinforced Vision-Language-Action Preference Optimization
[AAAI'26 Oral] DexGraspVLA: A Vision-Language-Action Framework Towards General Dexterous Grasping
HaMeR: Reconstructing Hands in 3D with Transformers
HOT3D: Hand and Object Tracking in 3D from Egocentric Multi-View Videos, CVPR 2025
CVPR2020. HOnnotate: A method for 3D Annotation of Hand and Object Poses
FoundPose: Unseen Object Pose Estimation with Foundation Features, ECCV 2024
This module provides functions for point cloud registration using Open3D. It includes functions for preprocessing point clouds, executing global registration, refining registration using ICP, and p…
Point Pair Feature-Based Pose Estimation with Multiple Edge Appearance Models (PPF-MEAM) for Robotic Bin Picking