The official code for our paper: Spatiotemporal Decoupling for Efficient Vision-Based Occupancy Forecasting.
This work has been accepted by CVPR 2025.
Jingyi Xu#, Xieyuanli Chen#, Junyi Ma, Jiawei Huang, Jintao Xu, Yue Wang, Ling Pei*.
If you use EfficientOCF in an academic work, please cite our paper:
@inproceedings{xu2025cvpr,
author = {Jingyi Xu and Xieyuanli Chen and Junyi Ma and Jiawei Huang and Jintao Xu and Yue Wang and Ling Pei},
title = {{Spatiotemporal Decoupling for Efficient Vision-Based Occupancy Forecasting}},
booktitle = {Proc.~of the IEEE/CVF Conf.~on Computer Vision and Pattern Recognition (CVPR)},
year = 2025
}
We follow the installation instructions of our codebase Cam4DOcc, which are also posted here:
- Create a conda virtual environment and activate it
```bash
conda create -n efficientocf python=3.7 -y
conda activate efficientocf
```
- Install PyTorch and torchvision (tested on torch==1.10.1 & cuda=11.3)
```bash
conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge
```
- Install gcc>=5 in conda env
```bash
conda install -c omgarcia gcc-6
```
- Install mmcv, mmdet, and mmseg
```bash
pip install mmcv-full==1.4.0
pip install mmdet==2.14.0
pip install mmsegmentation==0.14.1
```
- Install mmdet3d from the source code
```bash
git clone https://github.com/open-mmlab/mmdetection3d.git
cd mmdetection3d
git checkout v0.17.1 # Other versions may not be compatible.
python setup.py install
```
- Install other dependencies
```bash
pip install timm
pip install open3d-python
pip install PyMCubes
pip install spconv-cu113
pip install fvcore
pip install setuptools==59.5.0
pip install lyft_dataset_sdk # for lyft dataset
```
- Install occupancy pooling
```bash
git clone git@github.com:BIT-XJY/EfficientOCF.git
cd EfficientOCF
```
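As a quick sanity check after installation, you can verify that the core dependencies import correctly. The snippet below is only a minimal sketch assuming the versions from the steps above; adjust it to your own environment.
```python
# Minimal environment sanity check (sketch; expected versions follow the install steps above).
import torch
import mmcv
import mmdet
import mmseg
import mmdet3d

print("torch:", torch.__version__)        # expected 1.10.1
print("cuda available:", torch.cuda.is_available())
print("mmcv-full:", mmcv.__version__)     # expected 1.4.0
print("mmdet:", mmdet.__version__)        # expected 2.14.0
print("mmseg:", mmseg.__version__)        # expected 0.14.1
print("mmdet3d:", mmdet3d.__version__)    # expected 0.17.1
```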
- Please link your nuScenes V1.0 full dataset to the data folder.
- nuScenes-Occupancy, nuscenes_occ_infos_train.pkl, and nuscenes_occ_infos_val.pkl are also provided by the previous work.
- test_ids are also predefined for the nuScenes dataset.
- Please link your Lyft dataset to the data folder.
- The required folders are listed below:
```
EfficientOCF
├── data/
│   ├── nuscenes/
│   │   ├── maps/
│   │   ├── samples/
│   │   ├── sweeps/
│   │   ├── lidarseg/
│   │   ├── v1.0-test/
│   │   ├── v1.0-trainval/
│   │   ├── nuscenes_occ_infos_train.pkl
│   │   └── nuscenes_occ_infos_val.pkl
│   ├── nuScenes-Occupancy/
│   ├── lyft/
│   │   ├── maps/
│   │   ├── train_data/
│   │   └── images/  # from train images, containing xxx.jpeg
│   └── efficientocf/
│       ├── GMO/
│       │   └── ...
│       ├── GMO_lyft/
│       │   └── ...
│       └── test_ids/
```
Alternatively, you can manually modify the path parameters in the config files instead of using the default data structure; they are also listed here:
```python
occ_path = "./data/nuScenes-Occupancy"
depth_gt_path = './data/depth_gt'
train_ann_file = "./data/nuscenes/nuscenes_occ_infos_train.pkl"
val_ann_file = "./data/nuscenes/nuscenes_occ_infos_val.pkl"
ocf_dataset_path = "./data/efficientocf/"
nusc_root = './data/nuscenes/'
```
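For reference, one way to set up the links above is with symbolic links from wherever the raw datasets are stored. This is only a sketch; the source paths (`/path/to/...`) are placeholders for your own locations.
```python
# Sketch: link raw datasets into ./data (source paths are placeholders, replace with yours).
import os

links = {
    "/path/to/nuscenes": "./data/nuscenes",                      # nuScenes V1.0 full
    "/path/to/nuScenes-Occupancy": "./data/nuScenes-Occupancy",  # occupancy annotations
    "/path/to/lyft": "./data/lyft",                              # Lyft dataset
}

os.makedirs("./data", exist_ok=True)
for src, dst in links.items():
    if not os.path.islink(dst) and not os.path.exists(dst):
        os.symlink(src, dst)
```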
We integrate the EfficientOCF dataset generation pipeline directly into the dataloader, so you can simply run the training or evaluation scripts and wait.
For the nuScenes or nuScenes-Occupancy datasets, please run
```bash
bash run.sh ./projects/configs/baselines/EfficientOCF_V1.1.py 8
```
For the Lyft dataset, please run
```bash
bash run.sh ./projects/configs/baselines/EfficientOCF_V1.1_lyft.py 8
```
If you only want to test the performance of occupancy prediction for the present frame (current observation), please set test_present=True in the config files. Otherwise, forecasting performance on the future interval is evaluated.

To evaluate a trained model, please run
```bash
bash run_eval.sh $PATH_TO_CFG $PATH_TO_CKPT $GPU_NUM
# e.g. bash run_eval.sh ./projects/configs/baselines/EfficientOCF_V1.1.py ./work_dirs/EfficientOCF_V1.1/epoch_15.pth 8
```
Please set save_pred and save_path in the config files if you need to save prediction results.
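For reference, these switches are plain Python variables in the config files. The sketch below only illustrates how they might look; the output directory is a placeholder, not a project default.
```python
# Sketch of evaluation-related config options (values are examples, not project defaults).
test_present = False          # True: evaluate only present-frame occupancy prediction
save_pred = True              # True: save prediction results to disk
save_path = "./output/preds"  # placeholder output directory, choose your own
```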
Here is some basic information and key parameters for EfficientOCF.
| Type | Info | Parameter |
|---|---|---|
| train | 23,930 sequences | train_capacity |
| val | 5,119 frames | test_capacity |
| voxel size | 0.2m | voxel_x/y/z |
| range | [-51.2m, -51.2m, -5m, 51.2m, 51.2m, 3m] | point_cloud_range |
| volume size | [512, 512, 40] | occ_size |
| classes | 2 for V1.1 / 9 for V1.2 | num_cls |
| observation frames | 3 | time_receptive_field |
| future frames | 4 | n_future_frames |
| extension frames | 6 | n_future_frames_plus |
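The spatial parameters above are mutually consistent: the volume size follows from the point cloud range and the voxel size. A small check, for reference:
```python
# Derive occ_size from point_cloud_range and the 0.2 m voxel size.
point_cloud_range = [-51.2, -51.2, -5.0, 51.2, 51.2, 3.0]  # [x_min, y_min, z_min, x_max, y_max, z_max]
voxel_size = 0.2                                            # meters per voxel along x/y/z

occ_size = [
    int(round((point_cloud_range[i + 3] - point_cloud_range[i]) / voxel_size))
    for i in range(3)
]
print(occ_size)  # [512, 512, 40]
```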
Our proposed EfficientOCF still performs well when trained with only partial data. Please try decreasing train_capacity if you want to experiment with sparser supervision signals.
In addition, please make sure that n_future_frames_plus <= time_receptive_field + n_future_frames, since n_future_frames_plus is the actual number of predicted frames: we estimate more frames, including past ones, rather than only n_future_frames.
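With the default values from the table, this constraint holds (6 <= 3 + 4). A one-line check like the following can catch violations when you change these parameters:
```python
# Frame-count constraint check with the default values from the table above.
time_receptive_field = 3  # observation frames
n_future_frames = 4       # future frames
n_future_frames_plus = 6  # extension frames actually predicted (includes past frames)

assert n_future_frames_plus <= time_receptive_field + n_future_frames
```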
We thank the fantastic works Cam4DOcc, OpenOccupancy, PowerBEV, and FIERY for their pioneering code releases, which provide the codebase for this benchmark.