This survey reviews state-of-the-art 3D and 4D world models - systems that learn, predict, and simulate the geometry and dynamics of real environments from multi-modal signals.
We unify terminology, scope, and evaluations, and organize the space into three complementary paradigms by representation: video generation, occupancy generation, and LiDAR generation.
For more details, kindly refer to our paper and project page.
If you find this work helpful for your research, please kindly consider citing our paper:
@article{survey_3d_4d_world_models,
title = {3D and 4D World Modeling: A Survey},
author = {Lingdong Kong and Wesley Yang and Jianbiao Mei and Youquan Liu and Ao Liang and Dekai Zhu and Dongyue Lu and Wei Yin and Xiaotao Hu and Mingkai Jia and Junyuan Deng and Kaiwen Zhang and Yang Wu and Tianyi Yan and Shenyuan Gao and Song Wang and Linfeng Li and Liang Pan and Yong Liu and Jianke Zhu and Wei Tsang Ooi and Steven C. H. Hoi and Ziwei Liu},
journal = {arXiv preprint arXiv:2509.07996},
year = {2025},
}
- 0. Background
- 1. Benchmarks & Datasets
- 2. World Modeling from Video Generation
- 3. World Modeling from Occupancy Generation
- 4. World Modeling from LiDAR Generation
- 5. Applications
- 6. Other Resources
- 7. Acknowledgements
Unlike 2D projections, native 3D/4D signals directly encode metric geometry, visibility, and motion in the physical coordinates where agents act. Examples include:
- RGB-D imagery (2D images with depth channels)
- Occupancy grids (voxelized maps of free vs. occupied space)
- LiDAR point clouds (3D coordinates from active sensing)
- Neural fields (e.g., NeRF, Gaussian Splatting)
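To make the relationship between two of these signals concrete, here is a minimal sketch of turning a LiDAR-style point cloud into a binary occupancy grid. It is an illustration only, not a method from the survey: the function name `voxelize`, its parameters, and the toy points are all assumptions, and it relies on NumPy. Note that a `False` voxel here only means "no point landed in it", which is weaker than observed free space.

```python
import numpy as np

def voxelize(points, origin, voxel_size, grid_shape):
    """Mark every voxel that contains at least one point as occupied.

    points:     (N, 3) array of xyz coordinates (hypothetical metric units).
    origin:     (3,) coordinate of the grid's minimum corner.
    voxel_size: edge length of one cubic voxel.
    grid_shape: (X, Y, Z) number of voxels along each axis.
    """
    points = np.asarray(points, dtype=np.float64)
    # Convert metric coordinates to integer voxel indices.
    idx = np.floor((points - np.asarray(origin)) / voxel_size).astype(np.int64)
    # Discard points that fall outside the grid bounds.
    inside = np.all((idx >= 0) & (idx < np.asarray(grid_shape)), axis=1)
    idx = idx[inside]
    grid = np.zeros(grid_shape, dtype=bool)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return grid

# Toy point cloud (made-up values for illustration).
points = np.array([
    [0.1, 0.1, 0.1],
    [0.9, 0.2, 0.3],   # lands in the same voxel as the first point
    [2.5, 1.5, 0.5],
    [10.0, 0.0, 0.0],  # outside the 4 x 4 x 2 grid, so it is dropped
])
grid = voxelize(points, origin=(0.0, 0.0, 0.0), voxel_size=1.0, grid_shape=(4, 4, 2))
print(grid.shape, int(grid.sum()))
```

Stacking one such grid per timestep along a leading axis would give a simple 4D (space plus time) occupancy sequence.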
A 3D/4D world model is an internal representation that allows an agent to imagine, forecast, and interact with its environment in 3D space.
Together, these models provide the foundation for simulation, planning, and embodied intelligence in complex environments.
WorldBench | VBench | WorldScore |
Theme | Venue | Date | Location | Recording |
---|---|---|---|---|
Workshop on World Modeling | - | February 4-6, 2026 | Montréal | - |
Workshop on Embodied World Models for Decision Making | NeurIPS 2025 | December 6, 2025 | San Diego | - |
Workshop on Reliable and Interactable World Models: Geometry, Physics, Interactivity and Real-World Generalization | ICCV 2025 | October 19, 2025 | Hawai'i | - |
Workshop on Building Physically Plausible World Models | ICML 2025 | July 19, 2025 | Vancouver | - |
Workshop on Assessing World Models | ICML 2025 | July 18, 2025 | Vancouver | - |
Workshop on Benchmarking World Models | CVPR 2025 | June 12, 2025 | Nashville | - |
Workshop on World Models: Understanding, Modelling and Scaling | ICLR 2025 | April 28, 2025 | Singapore | - |
Workshop on Foundation Models for Autonomous Systems | CVPR 2024 | June 17, 2024 | Seattle | [YouTube] |
In chronological order, from the earliest to the latest.