This repo is currently under development. We're working hard to bring you a scalable and controllable method for generating unbounded and dynamic 3D driving scenes with high fidelity.
Please stay tuned for updates!
InfiniCube takes advantage of recent advances in 3D and video generative models to generate large dynamic scenes with flexible controls such as HD maps, vehicle bounding boxes, and text descriptions. First, we construct a map-conditioned 3D voxel generative model to enable unbounded voxel world generation. Then, we repurpose a video model and ground it on the voxel world through a set of pixel-aligned guidance buffers, synthesizing consistent appearance in long-video generation for large-scale scenes. Finally, we propose a fast feed-forward approach that employs both voxel and pixel branches to lift the videos to dynamic 3D Gaussians with controllable objects.
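To make the three-stage flow easier to follow, here is a minimal, purely illustrative sketch of how the stages chain together. Every class, function, and argument name below is a hypothetical placeholder (the code is not yet released); it only mirrors the data flow described above, not the actual InfiniCube API.

```python
# Illustrative sketch of the three-stage pipeline described above.
# All names, signatures, and data shapes are hypothetical placeholders
# for exposition; they are NOT the released InfiniCube API.

from dataclasses import dataclass


@dataclass
class LayoutControls:
    """Scene-level controls: HD map, vehicle bounding boxes, and a text prompt."""
    hd_map: object
    vehicle_boxes: list
    text: str


def generate_voxel_world(controls: LayoutControls):
    """Stage 1: map-conditioned 3D voxel generative model producing an
    unbounded voxel world (placeholder body)."""
    return {"voxels": None, "controls": controls}


def render_guidance_buffers(voxel_world, camera_trajectory):
    """Render pixel-aligned guidance buffers from the voxel world along a
    camera trajectory (placeholder body)."""
    return [{"frame": t, "buffers": None} for t in range(len(camera_trajectory))]


def generate_video(guidance_buffers, text: str):
    """Stage 2: video model grounded on the guidance buffers, synthesizing a
    long, appearance-consistent video (placeholder body)."""
    return {"frames": len(guidance_buffers), "text": text}


def lift_to_gaussians(video, voxel_world):
    """Stage 3: fast feed-forward lifting of the video to dynamic 3D
    Gaussians, combining voxel and pixel branches (placeholder body)."""
    return {"gaussians": None, "video": video, "voxels": voxel_world}


if __name__ == "__main__":
    controls = LayoutControls(hd_map=None, vehicle_boxes=[], text="a rainy highway at dusk")
    voxel_world = generate_voxel_world(controls)
    buffers = render_guidance_buffers(voxel_world, camera_trajectory=range(10))
    video = generate_video(buffers, text=controls.text)
    scene = lift_to_gaussians(video, voxel_world)
```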
- XCube: Large-Scale 3D Generative Modeling using Sparse Voxel Hierarchies.
- SCube: Instant Large-Scale Scene Reconstruction using VoxSplats.
- Cosmos-Drive-Dreams: Scalable Synthetic Driving Data Generation with World Foundation Models.
```bibtex
@misc{lu2024infinicube,
      title={InfiniCube: Unbounded and Controllable Dynamic 3D Driving Scene Generation with World-Guided Video Models},
      author={Yifan Lu and Xuanchi Ren and Jiawei Yang and Tianchang Shen and Zhangjie Wu and Jun Gao and Yue Wang and Siheng Chen and Mike Chen and Sanja Fidler and Jiahui Huang},
      year={2024},
      eprint={2412.03934},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2412.03934},
}
```