Official implementation for the paper "Compliant Residual DAgger: Improving Real-World Contact-Rich Manipulation with Human Corrections" by Xiaomeng Xu*, Yifan Hou*, Chendong Xin, Zeyi Liu, and Shuran Song.
The following is tested on Ubuntu 24.04.
- Install mamba.
- Clone this repo.
```bash
git clone git@github.com:yifan-hou/cr-dagger.git
```
- Create a virtual env called `pyrite`:
```bash
cd PyriteML
# Note 1: If env create gets stuck, you can create an empty environment, then install pytorch/torchvision/torchaudio following the official pytorch installation instructions, then install the rest via mamba.
# Note 2: zarr 3 changed many interfaces and does not work with PyriteML. We recommend using zarr 2.18.
mamba env create -f conda_environment.yaml
# after it finishes, activate it using
mamba activate pyrite
# a few pip installs
pip install robotmq     # for cross-process communication
pip install v4l2py
pip install toppra
pip install atomics
pip install vit-pytorch # need at least 1.7.12, which was not available in conda
pip install imagecodecs # need at least 2023.9.18, which caused lots of conflicts in conda
# Install local packages
cd PyriteUtilities
pip install -e .
```
- Setup environment variables: add the following to your .bashrc or .zshrc, editing the paths according to your local setup.
```bash
# where the collected raw data folders are
export PYRITE_RAW_DATASET_FOLDERS=$HOME/data/real
# where the post-processed data folders are
export PYRITE_DATASET_FOLDERS=$HOME/data/real_processed
# Each training session will create a folder here.
export PYRITE_CHECKPOINT_FOLDERS=$HOME/training_outputs
# Hardware configs.
export PYRITE_HARDWARE_CONFIG_FOLDERS=$HOME/hardware_interfaces/workcell/ur_test_bench/config
# Logging folder.
export PYRITE_CONTROL_LOG_FOLDERS=$HOME/data/control_log
```
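The scripts in this repo read these variables from the environment at runtime. As a quick sanity check, here is a short illustrative snippet (not part of the repo) that verifies they are set and point to existing folders:
```python
# check_pyrite_env.py: illustrative sanity check, not part of the repo.
import os

REQUIRED_VARS = [
    "PYRITE_RAW_DATASET_FOLDERS",
    "PYRITE_DATASET_FOLDERS",
    "PYRITE_CHECKPOINT_FOLDERS",
    "PYRITE_HARDWARE_CONFIG_FOLDERS",
    "PYRITE_CONTROL_LOG_FOLDERS",
]

for var in REQUIRED_VARS:
    path = os.environ.get(var)
    if path is None:
        print(f"{var}: NOT SET")
    elif not os.path.isdir(path):
        print(f"{var}: set to {path}, but the folder does not exist")
    else:
        print(f"{var}: ok ({path})")
```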
We provide an admittance controller implementation based on our force_control package.
Make sure the conda packages are visible to C++ linkers. Create a .sh file with the following content at `${CONDA_PREFIX}/etc/conda/activate.d/`, e.g. `$HOME/miniforge3/envs/pyrite/etc/conda/activate.d` if you installed miniforge at the default location:
```bash
# clib_path_activate.sh
export LD_LIBRARY_PATH=$HOME/miniforge3/envs/pyrite/lib/:$LD_LIBRARY_PATH
```
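If you prefer to script this step, the following sketch (illustrative, not part of the repo) writes the activation hook into whichever env is currently active. It leaves `${CONDA_PREFIX}` unexpanded so the path resolves at activation time:
```python
# write_activation_hook.py: illustrative helper; run with the pyrite env active.
import os
from pathlib import Path

# CONDA_PREFIX points at the active env, e.g. ~/miniforge3/envs/pyrite.
prefix = Path(os.environ["CONDA_PREFIX"])
hook_dir = prefix / "etc" / "conda" / "activate.d"
hook_dir.mkdir(parents=True, exist_ok=True)

# ${CONDA_PREFIX} is written literally on purpose: conda sets it when the
# env is activated, so the hook keeps working if the env moves.
hook = hook_dir / "clib_path_activate.sh"
hook.write_text("export LD_LIBRARY_PATH=${CONDA_PREFIX}/lib/:$LD_LIBRARY_PATH\n")
print(f"wrote {hook}")
```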
Pull the following packages:
```bash
# https://github.com/yifan-hou/cpplibrary
git clone git@github.com:yifan-hou/cpplibrary.git
# https://github.com/yifan-hou/force_control  (important: use the 'tracking' branch)
git clone -b tracking git@github.com:yifan-hou/force_control.git
# https://github.com/yifan-hou/hardware_interfaces  (important: use the 'tracking' branch)
git clone -b tracking git@github.com:yifan-hou/hardware_interfaces.git
```
Then build & install each of them following their readmes.
After building the hardware_interfaces package, a pybind library is generated under hardware_interfaces/workcell/table_top_manip/python/. This library contains a C++ multi-threaded server that maintains low-latency communication and data/command buffers with all involved hardware. It also maintains an admittance controller. We will launch a python script (the actor) that communicates with this hardware server; the python script itself does not need multi-processing.
We recommend installing to a local path for easy maintenance; this also avoids needing sudo access. To do so, replace the line
```bash
cmake ..
```
with
```bash
cmake -DCMAKE_INSTALL_PREFIX=$HOME/.local ..
```
when building the packages above. Here $HOME/.local can be replaced with any local path.
Then tell gcc to look for binaries/headers in your local path by adding the following to your .bashrc or .zshrc:
```bash
export PATH=$HOME/.local/bin:$PATH
export C_INCLUDE_PATH=$HOME/.local/include/:$C_INCLUDE_PATH
export CPLUS_INCLUDE_PATH=$HOME/.local/include/:$CPLUS_INCLUDE_PATH
export LD_LIBRARY_PATH=$HOME/.local/lib/:$LD_LIBRARY_PATH
```
Run `source ~/.bashrc` or reopen a terminal for these to take effect.
There are two main components when running CR-DAgger:

The Actor: a python process that runs the policy inference loops. It has three functionalities:
- Maintain communication with the hardware. The actor loads the base policy and optionally the residual policy, reads robot feedback from the hardware, and sends policy outputs to the hardware. When the actor is launched, the low-level controllers (hardware drivers, admittance controllers) are automatically instantiated and run in the background.
- Send data to the learner. The actor can send the collected time-stamped real data, including human correction motion data, to the learner for processing and training.
- Receive residual policy weights. The actor checks whether new policy weights are available at the beginning of every base policy loop.

The Learner: a python process that handles data processing and policy training. It listens to the actor for available data, and sends back updated residual policy weights.
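Conceptually, the two loops interact as in the following sketch. All names here are hypothetical; the actual entry points are the actor and learner scripts listed below, and the transport is the robot-message-queue package:
```python
# Illustrative sketch of the actor/learner interaction; not the real API.

def actor_loop(hardware, base_policy, residual_policy, channel):
    while True:
        # New residual weights are picked up at the start of each base policy loop.
        weights = channel.try_receive_weights()  # non-blocking
        if weights is not None:
            residual_policy.load_weights(weights)

        obs = hardware.read_feedback()           # robot + sensor feedback
        action = base_policy.infer(obs)
        if residual_policy is not None:
            action = action + residual_policy.infer(obs, action)
        hardware.send_command(action)            # executed by the C++ server

        # Time-stamped transitions (including human corrections) go to the learner.
        channel.send_transition(obs, action)

def learner_loop(channel, trainer):
    while True:
        episodes = channel.collect_new_transitions()
        trainer.process(episodes)                # data post-processing
        weights = trainer.train_residual()       # residual policy update
        channel.publish_weights(weights)
```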
A few useful features of this setup:
- The actor and the learner can run on different machines and communicate over the network, thanks to the robot-message-queue package. This means you can dedicate one desktop to online inference while performing training/data processing on a server, as long as the two machines can ping each other.
- Both the actor and the learner read configurations from a single config file, reducing the chance of mistakes.
Three config files are involved:

(a) The DAgger config: `PyriteML/online_learning/configs/config_v1.py`. This is the top-level config file shared across the learner and the actor. It controls actor and learner behavior and their communication. Actor and learner code load it by importing this python file. It also contains the paths to the other two config files.

(b) The hardware config: `hardware_interfaces/workcell/table_top_manip/config/`. Configs specific to your robot hardware, such as robot IP and calibrations. Note that the robot stiffness parameter under `admittance_controller` will be overridden by the actor node, as specified in the DAgger config above.

(c) The training config: `PyriteML/diffusion_policy/config/train_online_residual_mlp_workspace.yaml` and `PyriteML/diffusion_policy/config/task/*`. Policy training workspace and task configs for the residual policy, following the code structure from UMI. Note that you need to specify the path to the base policy checkpoint in the residual workspace config using the `base_policy_ckpt` field.
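For orientation, the DAgger config fields referenced in the launch modes below look roughly like this. This is an illustrative sketch with made-up defaults; check `config_v1.py` for the actual layout:
```python
# Illustrative sketch of the DAgger config fields used in the modes below.
# The actual config_v1.py may organize these differently.
from dataclasses import dataclass
from typing import Optional

@dataclass
class DAggerConfig:
    # Residual policy checkpoint; None runs the base policy alone.
    residual_ckpt_path: Optional[str] = None
    # Whether the actor streams collected transitions to the learner.
    send_transitions_to_server: bool = True
    # The learner starts its first training run after this many episodes.
    # Set it very large to disable training, or below the number of
    # existing episodes to start training immediately.
    num_episodes_before_first_training: int = 10
    # Folder where collected episodes are stored / read from.
    data_folder_path: str = "/path/to/data"
```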
Before launching, check the following:
- The `pyrite` virtual environment is activated.
- The env variables `PYRITE_CHECKPOINT_FOLDERS`, `PYRITE_HARDWARE_CONFIG_FOLDERS`, and `PYRITE_CONTROL_LOG_FOLDERS` are properly set.

Launch the learner:
```bash
python cr-dagger/PyriteML/online_learning/learners/residual_learner_v1.py
```
Launch the actor:
```bash
python cr-dagger/PyriteEnvSuites/env_runners/residual_online_learning_env_runner.py
```

I. Run the base policy alone and collect correction data:
- Set `residual_ckpt_path` to `None` in the DAgger config (a).
- Set `send_transitions_to_server` to `True` in the DAgger config (a).
- If you don't want to start training yet, set `num_episodes_before_first_training` to a large number in the DAgger config (a).
- Make sure `base_policy_ckpt` points to the correct base policy in the training config (c).
- Launch the actor and the learner in separate terminals. Follow the on-screen instructions of the actor.
II. Run the base policy plus a residual policy and collect correction data:
- Set `residual_ckpt_path` to the correct path in the DAgger config (a).
- Everything else is the same as I.

III. Continue data collection into an existing data folder:
- Make sure `data_folder_path` in the DAgger config (a) points to the folder with existing data.
- Everything else is the same as above.

IV. Online training. The learner will repeatedly: 1. process all available new data, 2. launch training, 3. send updated weights to the actor.
- Set `num_episodes_before_first_training` to a suitable number in the DAgger config (a).
- Everything else is the same as I or II.
V. Train from existing data:
- Make sure `data_folder_path` in the DAgger config (a) points to the folder with existing data.
- Set `num_episodes_before_first_training` to a number lower than the number of existing episodes in the DAgger config (a).
- Launch the learner.
VI. Evaluate a trained policy (no data collection, no training):
- Set `residual_ckpt_path` to the correct path in the DAgger config (a).
- Set `send_transitions_to_server` to `False` in the DAgger config (a).
- Set `num_episodes_before_first_training` to a large number in the DAgger config (a).
- Make sure `base_policy_ckpt` points to the correct base policy in the training config (c).
- Launch the actor and the learner.
For data collection and training of the base policy, please refer to base_policy.md.
If you find this repo useful, you can acknowledge us by citing our paper:
```bibtex
@inproceedings{CRDagger2025,
  title={Compliant Residual DAgger: Improving Real-World Contact-Rich Manipulation with Human Corrections},
  author={Xiaomeng Xu* and Yifan Hou* and Chendong Xin and Zeyi Liu and Shuran Song},
  year={2025},
  booktitle={The 39th Annual Conference on Neural Information Processing Systems (NeurIPS)},
}
```
This repository is released under the MIT license.
This work was supported in part by NSF Awards #2143601, #2037101, and #2132519, a Sloan Fellowship, the Toyota Research Institute, and Samsung. We would like to thank Google and TRI for the UR5 robot hardware. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the sponsors.
Special thanks to @yihuai-gao for lots of customization of the robot-message-queue package for this project.
- This code base is built on top of adaptive-compliance-policy.
- The cross-process architecture is built using robot-message-queue.
- The training pipeline is modified from universal_manipulation_interface.
- We include a copy of multimodal_representation for the temporal convolution encoding of forces.