We introduce DexVLG, a vision-language-grasp model trained on a large-scale synthetic dataset. It generates instruction-aligned dexterous grasp poses and achieves state-of-the-art grasp success rates and part-grasp accuracy.
- DexGraspNet3.0, a large-scale dataset containing 170M part-aligned dexterous grasp poses on 174k objects, each pose annotated with a semantic caption.
- DexVLG, a vision-language model that generates language-instructed dexterous grasp poses end-to-end (see the hypothetical usage sketch after this list).
- We curate benchmarks and conduct extensive experiments to evaluate DexVLG in simulation and the real world.
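Since the training/inference code and weights are not yet released, the snippet below is only a minimal sketch of what a part-aligned grasp record and an instruction-conditioned inference call might look like. Every name in it (`GraspRecord`, `generate_grasp`, the assumed joint count) is an illustrative assumption, not the released DexVLG API.

```python
# Hypothetical sketch: a part-aligned grasp record and a language-conditioned
# grasp generator. Names and shapes are assumptions for illustration only.
from dataclasses import dataclass

import numpy as np


@dataclass
class GraspRecord:
    object_id: str                 # object the grasp is annotated on
    part_caption: str              # semantic caption of the grasped part, e.g. "the mug handle"
    wrist_translation: np.ndarray  # (3,) hand-root position in the object frame
    wrist_rotation: np.ndarray     # (3, 3) hand-root orientation
    joint_angles: np.ndarray       # (J,) dexterous-hand joint configuration


def generate_grasp(point_cloud: np.ndarray, instruction: str) -> GraspRecord:
    """Placeholder for the language-conditioned grasp-pose generator.

    A real model would consume the object observation and the instruction and
    regress a full hand pose; here we return a dummy record to show the shape
    of the expected output.
    """
    num_joints = 22  # assumed hand DoF, purely illustrative
    return GraspRecord(
        object_id="example_object",
        part_caption=instruction,
        wrist_translation=np.zeros(3),
        wrist_rotation=np.eye(3),
        joint_angles=np.zeros(num_joints),
    )


if __name__ == "__main__":
    cloud = np.random.rand(4096, 3)  # stand-in for a captured object point cloud
    grasp = generate_grasp(cloud, "grasp the mug by its handle")
    print(grasp.part_caption, grasp.joint_angles.shape)
```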
TODO List:
- Release the training/inference code of DexVLG
- Release model weights
```bibtex
@article{dexvlg25,
  title={DexVLG: Dexterous Vision-Language-Grasp Model at Scale},
  author={He, Jiawei and Li, Danshi and Yu, Xinqiang and Qi, Zekun and Zhang, Wenyao and Chen, Jiayi and Zhang, Zhaoxiang and Zhang, Zhizheng and Yi, Li and Wang, He},
  journal={arXiv preprint arXiv:2507.02747},
  year={2025}
}
```