Skip to content

x-robotics-lab/dexscrew

Repository files navigation

Learning Dexterous Manipulation Skills from Imperfect Simulations

Project Page | arXiv

Elvis Hsieh*, Wen-Han Hsieh*, Yen-Jen Wang*, Toru Lin, Jitendra Malik, Koushil Sreenath† Haozhi Qi†,
∗: Equal contribution (listed in alphabetical order). †: Equal advising.

Demo

Installation

See installation instructions.

Introduction

Our method contains the following four steps.

  1. Learn a oracle policy with privileged information and point-clouds with RL in simulation.
  2. Learn a padapt-based student policy using the oracle policy in simulation.
  3. Using the trained rotation policy as motion prior, we leverage teleoperation to collect trajectories with downward motion and tactile sensing in real world.
  4. Train a behavior cloning policy with expert trajectories to fuse extra observations.

The following session only provides example script of our method. For baselines, checkout baselines.

Step 1: Oracle Policy Training

To train an oracle policy $f$ with RL, run

# 0 is GPU is
# 42 is experiment seed
scripts/screwdriver_teacher.sh 0 42 output_name

After training your oracle policy, you can visualize it as follows:

scripts/vis_screwdriver_teacher.sh 0 42 ckpt_name

Step 2: Sensorimotor policy Training

In this section, we train a sensorimotor policy by distilling from our trained oracle policy $f$.

Note we use the proprioceptive adapt to train the sensorimotor policy.

scripts/screwdriver_student_padapt.sh 0 42 output_name

Step 3: Rotational Policy deployment in Real Hardware

To generate the rotational policy from the student policy $\pi$, run

scripts/convert_student_jit.sh

To deploy rotational policy on real hardware, please refer to ./xhand-deploy.

Step 4: Real-world Fine-tuning

See the following repository: skill-teleop

Acknowledgement

This repository is built based on penspin, Hora and IsaacGymEnvs, and supported in part by the program "Design of Robustly Implementable Autonomous and Intelligent Machines (TIAMAT)", Defense Advanced Research Projects Agency award number HR00112490425. We thank Mengda Xu for his valuable feedback.

Citation

If you find dexscrew or this codebase helpful in your research, please cite:

@article{hsieh2025learning,
  title={Learning Dexterous Manipulation Skills from Imperfect Simulations},
  author={Hsieh, Elvis and Hsieh, Wen-Han and Wang, Yen-Jen and Lin, Toru and Malik, Jitendra and Sreenath, Koushil and Qi, Haozhi},
  journal={arXiv:2512.02011},
  year={2025}
}