- Graduate Student | Xiamen University (Sept 2023 - Present)
- B.S. Artificial Intelligence | Nanchang University (Sept 2019 - June 2023)
- Research Intern | Baidu Inc. (Aug 2025 - Present) - Video Generation Research
- Research Assistant | Texas A&M University (May 2025 - Aug 2025) - 3D Vision & Embodied Intelligence
- Research Assistant | VITA Group, University of Texas at Austin (Jan 2024 - May 2025) - 3D Spatial Reconstruction & Understanding
ArXiv 2025 | Jian Zhang*, Zhiwen Fan*, et al.
Unified VLM framework incorporating 3D reconstructive instruction tuning, which processes monocular video to derive implicit 3D tokens for spatial assistance and embodied reasoning.
Preprint | Kairun Wen*, Yuzhi Huang*, ..., Jian Zhang, et al.
Large-scale dataset with 100K+ videos, 800K+ masks, and 10M+ frames for understanding dynamic physical worlds with evolving 3D structure and motion.
NeurIPS 2024 | Jian Zhang*, Zhiwen Fan*, et al.
First real-time semantic 3D reconstruction system that directly processes unposed RGB images into semantic radiance fields in a single feed-forward pass.
ArXiv 2024 | Zhiwen Fan*, Kairun Wen*, ..., Jian Zhang, et al.
Lightning-fast sparse-view 3D scene reconstruction using a self-supervised framework that jointly optimizes the 3D scene representation and camera poses.
- 3D-Consistent Video Generation: Creating spatially coherent visual content
- 3D Spatial Understanding: Developing comprehensive 3D perception
- Research Collaborations: Building the future of 3D AI together
Particularly interested in opportunities that bridge cutting-edge research with real-world applications.