Please follow the instructions below to install the conda environment and the real-robot environment. We recommend using CUDA 11.8 during installation to avoid compatibility issues.
Also, remember to adjust the Hand-Eye-Calibration in eval_ros_Pose.py and eval_ros_RPDP.py according to your own environment.
- Create a new conda environment and activate it.
conda create -n PoseInsert python=3.8
conda activate PoseInsert
- Manually install cudatoolkit, then install the necessary dependencies.
pip install -r requirements.txt
We use the Cobot Mobile ALOHA, manufactured by agilex.ai. Please calibrate the camera with the robot before data collection and evaluation to ensure correct spatial transformations between the camera and the robot.
- Place collect_data/ros_pose_gripper.py into FoundationPose, then place the gripper in front of the camera.
python ros_pose_gripper.py # publish the gripper pose
python calibation/calibation_fk.py # get the camera_in_base
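The calibration above relates two poses of the same gripper: its pose in the camera frame (as published by the pose script) and its pose in the robot base frame (from forward kinematics). A minimal sketch of that composition follows, assuming both poses are available as 4x4 homogeneous matrices. The function names and numeric values are illustrative and are not taken from calibation/calibation_fk.py.

```python
import numpy as np

def invert_transform(T):
    """Invert a 4x4 rigid transform using the transpose of its rotation."""
    R, t = T[:3, :3], T[:3, 3]
    T_inv = np.eye(4)
    T_inv[:3, :3] = R.T
    T_inv[:3, 3] = -R.T @ t
    return T_inv

def camera_in_base(gripper_in_base, gripper_in_camera):
    """Compose T_base_camera = T_base_gripper @ inv(T_camera_gripper)."""
    return gripper_in_base @ invert_transform(gripper_in_camera)

def make_pose(yaw, translation):
    """Hypothetical helper: build a pose from a yaw angle and a translation."""
    c, s = np.cos(yaw), np.sin(yaw)
    T = np.eye(4)
    T[:3, :3] = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    T[:3, 3] = translation
    return T

# Made-up values for illustration only (metres, radians).
true_camera_in_base = make_pose(0.3, [0.5, 0.0, 0.8])
gripper_in_camera = make_pose(-1.2, [0.1, -0.05, 0.6])
# Simulate what forward kinematics would report for the same gripper.
gripper_in_base = true_camera_in_base @ gripper_in_camera

estimate = camera_in_base(gripper_in_base, gripper_in_camera)
```

With real measurements, both poses are noisy, so combining several gripper placements is generally more reliable than a single one. A matrix of this kind is presumably what the Hand-Eye-Calibration setting in eval_ros_Pose.py and eval_ros_RPDP.py expects, but check those scripts for the exact convention.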
Human demonstrations are collected as follows.
- Place collect_data/ros_pose_source2.py and collect_data/ros_pose_target2.py into FoundationPose, then get the source/target object poses.
python ros_pose_source2.py # publish the source object pose
python ros_pose_target2.py # publish the target object pose
- Collect the training data. Remember to adjust save_dir in collect_pose.py.
python collect_pose.py --idx 0
- Show the training data.
python replay_with_workspace.py
- Show the training trajectories.
python vis_traj.py
- Get the workspace for normalization.
python get_workspace.py
- Train the model. Remember to adjust data_path and ckpt_dir.
python train_pose.py
python train_RPDP.py
- Test. Remember to adjust ckpt.
python eval_pose.py
python eval_RPDP.py
- Test in the real world. Before starting the policy, the robot should grasp the source object.
python device/robot_bringup.py # start up the robot
python eval_ros_Pose.py
python eval_ros_RPDP.py
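One of the steps above computes a workspace for normalization. As a rough illustration of that idea, the sketch below derives per-axis bounds from recorded translations and maps positions into [-1, 1]. The array layout, the margin parameter, and the function names are assumptions made for this example, not the actual contents of get_workspace.py.

```python
import numpy as np

def compute_workspace(translations, margin=0.0):
    """Return per-axis (lower, upper) bounds over all recorded translations."""
    translations = np.asarray(translations, dtype=float)
    lower = translations.min(axis=0) - margin
    upper = translations.max(axis=0) + margin
    return lower, upper

def normalize(xyz, lower, upper):
    """Map positions linearly from [lower, upper] to [-1, 1] on each axis."""
    return 2.0 * (np.asarray(xyz, dtype=float) - lower) / (upper - lower) - 1.0

def denormalize(xyz_norm, lower, upper):
    """Inverse of normalize: map [-1, 1] back to workspace coordinates."""
    return (np.asarray(xyz_norm, dtype=float) + 1.0) / 2.0 * (upper - lower) + lower

# Made-up demonstration positions (x, y, z), for illustration only.
demo_xyz = np.array([
    [0.30, -0.10, 0.05],
    [0.45,  0.20, 0.25],
    [0.60,  0.05, 0.15],
])

lower, upper = compute_workspace(demo_xyz)
normalized = normalize(demo_xyz, lower, upper)
restored = denormalize(normalized, lower, upper)
```

A small positive margin keeps poses seen at test time from falling just outside the normalized range; whether the repository does this is not stated here.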
The PoseInsert policy consists of (1) a pose encoder (policy/cnn.py), (2) an RGBD encoder (policy/cnn.py), (3) a Pose-Guided Residual Gated Fusion module (policy/cnn.py), and (4) a diffusion action decoder (policy/diffusion.py).
- Our diffusion module is adapted from RISE.
- Our RGBD encoder is adapted from FoundationPose.
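To give a feel for what a residual gated fusion of pose and RGBD features can look like, here is a generic NumPy sketch: the pose feature acts as the main branch, and a learned gate decides how much of the RGBD feature is added as a residual. This is one common form of gated fusion and an assumption on our part for illustration; the actual module in policy/cnn.py may be structured differently. All names, shapes, and weights below are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_residual_fusion(pose_feat, rgbd_feat, W_gate, b_gate):
    """Add a gated RGBD residual to the pose feature.

    The gate is computed from both features, so each channel of the
    RGBD feature is scaled by a value in (0, 1) before being added.
    """
    joint = np.concatenate([pose_feat, rgbd_feat], axis=-1)
    gate = sigmoid(joint @ W_gate + b_gate)
    return pose_feat + gate * rgbd_feat

# Random stand-ins for encoder outputs and learned parameters.
rng = np.random.default_rng(0)
batch, dim = 2, 8
pose_feat = rng.standard_normal((batch, dim))
rgbd_feat = rng.standard_normal((batch, dim))
W_gate = rng.standard_normal((2 * dim, dim)) * 0.1
b_gate = np.zeros(dim)

fused = gated_residual_fusion(pose_feat, rgbd_feat, W_gate, b_gate)
```

In this form, a gate near zero leaves the pose feature unchanged, so the RGBD branch contributes only where the gate opens.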
@article{sun2025exploringposeguidedimitationlearning,
title = {Exploring Pose-Guided Imitation Learning for Robotic Precise Insertion},
author = {Han Sun and Yizhao Wang and Zhenning Zhou and Shuai Wang and Haibo Yang and Jingyuan Sun and Qixin Cao},
journal = {arXiv preprint arXiv:2404.12281},
year = {2025}
}
PoseInsert (including data and codebase) is licensed under CC BY-NC-SA 4.0.