Results for pose-grounded generative 3D reconstruction from a single test image. Given an input image (top left), CUPID estimates the camera pose (bottom left), reconstructs a 3D model (bottom right), and re-renders the input view (top right). It is robust to changes in scale, placement, and lighting while preserving fine texture, and it supports component-aligned scene reconstruction (bottom row). All results are produced in seconds via feed-forward sampling of the learned model. See cupid3d.github.io for an interactive view of the 3D results.
The code is currently under development and is expected to be released by January 2026. For progress updates and the official release, please check the project website or the GitHub repository.
If you want to cite our work, please use:
@article{huang2025cupid,
  title={CUPID: Pose-Grounded Generative 3D Reconstruction from a Single Image},
  author={Huang, Binbin and Duan, Haobin and Zhao, Yiqun and Zhao, Zibo and Ma, Yi and Gao, Shenghua},
  journal={arXiv preprint arXiv:2510.20776},
  year={2025}
}