One-Policy-Fits-All: Geometry-Aware Action Latents for Cross-Embodiment Manipulation

ICRA 2026

Juncheng Mu¹,²,*, Sizhe Yang¹,³,*, Hojin Bae²,*, Feiyu Jia¹,⁴, Qingwei Ben¹,³, Boyi Li⁵,†, Huazhe Xu²,†, Jiangmiao Pang¹,†

¹Shanghai AI Laboratory

²Tsinghua University

³The Chinese University of Hong Kong

⁴University of Science and Technology of China

⁵NVIDIA

*Equal contribution. †Corresponding author.

Overview

We introduce One-Policy-Fits-All (OPFA), a general framework for cross-embodiment manipulation. OPFA uses the geometric structure of diverse end-effectors to construct a unified latent action representation, and a unified latent retargeting decoder to recover embodiment-specific actions from it. This design enables skill transfer between grippers and dexterous hands, offering a scalable way to address data scarcity and to adapt rapidly to new embodiments.

Method

OPFA is trained in two stages. (1) We first construct a Geometry-Aware Latent Representation (GaLR) by encoding sampled reachable-state point clouds with 3D convolutions and geometric transformers, which extract local and global features. A unified latent retargeting decoder then recovers embodiment-specific actions from the latent space, enabling end-to-end training without manual annotations. (2) The pretrained encoder–decoder pair can be integrated into any downstream policy (e.g., DP3), so that the policy is trained jointly on cross-embodiment data in a unified latent action space.
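The data flow of the two stages can be illustrated with a toy sketch. It is not the authors' implementation. The names `Embodiment`, `joint_limits`, `encode`, and `decode` are hypothetical, and the latent is a single scalar. Simple hand-written maps over joint values stand in for the learned point-cloud encoder and the latent retargeting decoder. The only point is the interface: actions of different dimensionality are mapped into one shared latent space, and a decoder conditioned on the embodiment recovers an embodiment-specific action.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Embodiment:
    """Hypothetical end-effector description: a name and per-joint upper limits."""
    name: str
    joint_limits: Tuple[float, ...]


def encode(emb: Embodiment, action: List[float]) -> float:
    """Toy stand-in for the learned encoder: mean normalized joint closure."""
    if len(action) != len(emb.joint_limits):
        raise ValueError("action length does not match the embodiment")
    return sum(a / lim for a, lim in zip(action, emb.joint_limits)) / len(action)


def decode(emb: Embodiment, latent: float) -> List[float]:
    """Toy stand-in for the retargeting decoder: shared latent -> embodiment action."""
    return [latent * lim for lim in emb.joint_limits]


# Two embodiments with different action dimensionality (assumed values).
gripper = Embodiment("gripper", (0.08,))
hand = Embodiment("hand", (1.6,) * 16)

# Stage 2 (sketch): demonstrations from both embodiments become regression
# targets in the same latent space, so one policy can be trained on all of them.
demos = [(gripper, [0.04]), (hand, [0.8] * 16)]
latent_targets = [encode(emb, action) for emb, action in demos]

# At execution time the policy's latent output is decoded for the robot in use.
# Here the latent of the hand demonstration is replayed on the gripper.
gripper_action = decode(gripper, latent_targets[1])
```

In this toy setting a half-closed hand and a half-closed gripper map to the same latent value, which is what lets demonstrations from one embodiment supervise a policy that is later executed on another.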

BibTeX


@article{mu2026one,
  title={One-Policy-Fits-All: Geometry-Aware Action Latents for Cross-Embodiment Manipulation},
  author={Mu, Juncheng and Yang, Sizhe and Bae, Hojin and Jia, Feiyu and Ben, Qingwei and Li, Boyi and Xu, Huazhe and Pang, Jiangmiao},
  journal={arXiv preprint arXiv:2603.14522},
  year={2026}
}