One-Policy-Fits-All

Geometry-Aware Action Latents for
Cross-Embodiment Manipulation

ICRA 2026

Juncheng Mu1,2,* , Sizhe Yang1,3,* , Hojin Bae2,* , Feiyu Jia1,4 ,
Qingwei Ben1,3 , Boyi Li5,† , Huazhe Xu2,† , Jiangmiao Pang1,†
1Shanghai AI Laboratory
2Tsinghua University
3The Chinese University of Hong Kong
4University of Science and Technology of China
5NVIDIA
*Equal contribution †Corresponding author

Overview

One-Policy-Fits-All overview: breadth of manipulation tasks

We introduce One-Policy-Fits-All (OPFA), a general framework for cross-embodiment manipulation. OPFA leverages the geometric structures of diverse end-effectors to construct a unified latent action representation, and employs a unified latent retargeting decoder to recover embodiment-specific actions. This design enables seamless skill transfer across grippers and dexterous hands, offering a scalable solution to data scarcity and enabling rapid adaptation to new embodiments.

Method

One-Policy-Fits-All pipeline

OPFA is trained in two stages. (1) We first construct a Geometry-Aware Latent Representation (GaLR) by encoding point clouds of sampled reachable states, using 3D convolutions and geometric transformers to extract local and global features. A unified latent retargeting decoder then recovers embodiment-specific actions from the latent space, so the encoder and decoder can be trained end to end without manual annotations. (2) The pretrained encoder–decoder pair is then integrated into a downstream policy (e.g., DP3), so that the policy can be trained jointly on cross-embodiment data in a unified latent action space.
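The data flow described above can be illustrated with a toy sketch. This is not the OPFA implementation: a PointNet-style per-point projection with max pooling stands in for the 3D convolutions and geometric transformers, the weights are random and untrained, and every name and size here (`ToyGeometryEncoder`, `ToyRetargetingDecoder`, `LATENT_DIM`, `MAX_DOF`, the `gripper` and `dex_hand` degrees of freedom) is a hypothetical choice made for illustration.

```python
import math
import random

LATENT_DIM = 8   # hypothetical width of the shared latent action space
MAX_DOF = 16     # hypothetical padded action width across embodiments


def random_matrix(rows, cols, rng):
    """Small random weight matrix; stands in for trained parameters."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class ToyGeometryEncoder:
    """Maps a point cloud (list of [x, y, z]) to one latent vector.

    Assumption: a shared per-point projection followed by max pooling,
    used only as a simple permutation-invariant stand-in.
    """

    def __init__(self, rng):
        self.weights = random_matrix(LATENT_DIM, 3, rng)

    def encode(self, points):
        per_point = [[math.tanh(v) for v in matvec(self.weights, p)]
                     for p in points]
        # Max over points, per feature channel: order of points is irrelevant.
        return [max(channel) for channel in zip(*per_point)]


class ToyRetargetingDecoder:
    """One decoder shared by all embodiments.

    It is conditioned on a per-embodiment vector and emits a padded action,
    which is truncated to that embodiment's degrees of freedom.
    """

    def __init__(self, embodiment_dofs, rng):
        self.dofs = dict(embodiment_dofs)
        self.embeddings = {
            name: [rng.gauss(0.0, 0.1) for _ in range(LATENT_DIM)]
            for name in self.dofs
        }
        self.weights = random_matrix(MAX_DOF, 2 * LATENT_DIM, rng)

    def decode(self, latent, embodiment):
        conditioned = list(latent) + self.embeddings[embodiment]
        padded_action = matvec(self.weights, conditioned)
        return padded_action[: self.dofs[embodiment]]


rng = random.Random(0)
encoder = ToyGeometryEncoder(rng)
decoder = ToyRetargetingDecoder({"gripper": 1, "dex_hand": 16}, rng)

# Stage 1 (sketch): encode a sampled point cloud into the shared latent.
cloud = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(64)]
latent = encoder.encode(cloud)

# Stage 2 (sketch): a downstream policy would predict `latent`; the same
# decoder then recovers an action sized for each embodiment.
gripper_action = decoder.decode(latent, "gripper")
hand_action = decoder.decode(latent, "dex_hand")
print(len(latent), len(gripper_action), len(hand_action))  # prints: 8 1 16
```

The point of the sketch is the interface: one latent vector per action, and one decoder that serves end-effectors with different action dimensions, which is what lets a single policy be trained on data from several embodiments.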

BibTeX


@article{mu2026one,
  title={One-Policy-Fits-All: Geometry-Aware Action Latents for Cross-Embodiment Manipulation},
  author={Mu, Juncheng and Yang, Sizhe and Bae, Hojin and Jia, Feiyu and Ben, Qingwei and Li, Boyi and Xu, Huazhe and Pang, Jiangmiao},
  journal={arXiv preprint arXiv:2603.14522},
  year={2026}
}