Learning Dexterous Manipulation Skills from Imperfect Simulations

Elvis Hsieh*, Wen-Han Hsieh*, Yen-Jen Wang*, Toru Lin, Jitendra Malik, Koushil Sreenath†, Haozhi Qi†
*: Equal contribution (listed in alphabetical order). †: Equal advising.

Installation

See installation instructions.

Introduction

Our method contains the following four steps.

1. Learn an oracle policy with privileged information and point clouds using RL in simulation.
2. Learn a padapt-based student policy from the oracle policy in simulation.
3. Using the trained rotation policy as a motion prior, collect trajectories with downward motion and tactile sensing in the real world through teleoperation.
4. Train a behavior cloning policy on the expert trajectories to fuse the extra observations.

The following sections only provide example scripts for our method. For baselines, see baselines.

Step 1: Oracle Policy Training

To train an oracle policy $f$ with RL, run

# 0 is the GPU id
# 42 is experiment seed
scripts/screwdriver_teacher.sh 0 42 output_name
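
To repeat the run over several seeds, the command above can be generated programmatically. The following is a minimal sketch, not part of the repository: the helper `teacher_commands` and the `teacher_seed<N>` output naming are assumptions for illustration, and only the script path and argument order are taken from the command above.

```python
import shlex

# Script path and argument order (GPU id, seed, output name) come from the command above.
TEACHER_SCRIPT = "scripts/screwdriver_teacher.sh"


def teacher_commands(gpu_id, seeds, prefix="teacher"):
    """Build one training command per seed. The output naming scheme is hypothetical."""
    commands = []
    for seed in seeds:
        output_name = f"{prefix}_seed{seed}"  # hypothetical naming scheme
        commands.append([TEACHER_SCRIPT, str(gpu_id), str(seed), output_name])
    return commands


if __name__ == "__main__":
    for cmd in teacher_commands(gpu_id=0, seeds=[42, 43, 44]):
        # Dry run: print each command instead of executing it.
        # To launch, pass `cmd` to subprocess.run(cmd, check=True).
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to check the GPU id, seed, and output name before starting a long training run.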

After training your oracle policy, you can visualize it as follows:

scripts/vis_screwdriver_teacher.sh 0 42 ckpt_name

Step 2: Sensorimotor Policy Training

In this section, we train a sensorimotor policy by distilling from our trained oracle policy $f$.

Note that we use proprioceptive adaptation (padapt) to train the sensorimotor policy.

scripts/screwdriver_student_padapt.sh 0 42 output_name
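
For intuition, the idea of distilling a student from an oracle can be sketched on a toy problem. This is not the repository's implementation: the 1-D linear teacher, the noisy student observation, the learning rate, and all function names are assumptions chosen for illustration. The teacher acts on the true state, the student sees only a noisy observation of it, and the student is fit by regression onto the teacher's actions.

```python
import random


def teacher_action(state):
    """Hypothetical oracle: acts on the true (privileged) state."""
    return -2.0 * state


def distill(num_steps=2000, lr=0.05, noise=0.01, seed=0):
    """Fit a linear student a = w * obs to the teacher's actions with SGD on squared error."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(num_steps):
        state = rng.uniform(-1.0, 1.0)
        obs = state + rng.gauss(0.0, noise)  # the student only sees a noisy observation
        target = teacher_action(state)       # the teacher labels the same state
        error = w * obs - target
        w -= lr * error * obs                # gradient of 0.5 * error**2 with respect to w
    return w


if __name__ == "__main__":
    print(f"student weight: {distill():.3f}")
```

In this toy setting the student weight approaches the teacher's gain of -2.0; the actual student policy operates on proprioceptive inputs and is trained by the script above.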

Step 3: Rotational Policy Deployment on Real Hardware

To generate the rotational policy from the student policy $\pi$, run

scripts/convert_student_jit.sh

To deploy the rotational policy on real hardware, please refer to ./xhand-deploy.

Step 4: Real-world Fine-tuning

See the following repository: skill-teleop

Acknowledgement

This repository is built on penspin, Hora, and IsaacGymEnvs, and is supported in part by the program "Design of Robustly Implementable Autonomous and Intelligent Machines (TIAMAT)", Defense Advanced Research Projects Agency award number HR00112490425. We thank Mengda Xu for his valuable feedback.

Citation

If you find dexscrew or this codebase helpful in your research, please cite:

@article{hsieh2025learning,
  title={Learning Dexterous Manipulation Skills from Imperfect Simulations},
  author={Hsieh, Elvis and Hsieh, Wen-Han and Wang, Yen-Jen and Lin, Toru and Malik, Jitendra and Sreenath, Koushil and Qi, Haozhi},
  journal={arXiv:2512.02011},
  year={2025}
}
