Duochao Shi* · Weijie Wang* · Donny Y. Chen · Zeyu Zhang · Jia-Wang Bian · Bohan Zhuang · Chunhua Shen
Paper | Project Page | Models
We introduce PM-Loss, a novel regularization loss for feed-forward 3DGS based on a learned pointmap, leading to more coherent 3D geometry and better rendering.
- 09/06/25 Update: Check out our ZPressor, a plug-and-play module that compresses multi-view inputs for scalable feed-forward 3DGS, enabling existing feed-forward 3DGS models to scale to over 100 input views!
Our code is developed with PyTorch 2.4.0, CUDA 12.4, and Python 3.10.
We recommend using conda for installation:
git clone https://github.com/aim-uofa/PM-Loss
cd PM-Loss
conda create -y -n pmloss python=3.10
conda activate pmloss
pip install torch==2.4.0 torchvision==0.19.0 --index-url https://download.pytorch.org/whl/cu124
pip install xformers==0.0.27.post2
pip install git+https://github.com/facebookresearch/pytorch3d.git # For training only
pip install -r requirements.txt
We also provide a pip-installable package for PM-Loss. If you want to use our loss function in your own project, you can install it directly by running:
pip install git+https://github.com/aim-uofa/PM-Loss#subdirectory=pmloss # need pytorch3d installed
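For intuition only, the core of PM-Loss is a point-set distance between the predicted Gaussian centers and a pointmap predicted by a pretrained model. Below is a minimal sketch of such a term using PyTorch3D's chamfer_distance; the function name, tensor shapes, and default weight are illustrative assumptions, not the pmloss package's actual API:

import torch
from pytorch3d.loss import chamfer_distance

def pointmap_regularization(gaussian_centers: torch.Tensor,
                            pointmap_points: torch.Tensor,
                            weight: float = 0.005) -> torch.Tensor:
    # gaussian_centers: (B, N, 3) predicted 3D Gaussian means.
    # pointmap_points:  (B, M, 3) points from a pretrained pointmap model
    # (e.g. VGGT), used as pseudo ground truth for the scene geometry.
    # A point-set distance regularizes the geometry as a whole, rather
    # than supervising each per-view depth map independently.
    loss, _ = chamfer_distance(gaussian_centers, pointmap_points)
    return weight * loss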
For our view synthesis experiments with Gaussian splatting, we primarily use the DL3DV dataset for training. We evaluate our model on both DL3DV and RealEstate10K. For all experiments, we use a resolution of 256x448. Our data processing approach is adapted from previous works, including pixelSplat, MVSplat, and DepthSplat.
By default, we assume the datasets are placed in datasets/re10k and datasets/DL3DV_480P. Otherwise, you will need to specify your dataset path with dataset.roots=[YOUR_DATASET_PATH] in the running script.
For the test set, we use the DL3DV-Benchmark split, which contains 140 scenes for evaluation. For the training set, we use the DL3DV-480P dataset.
We provide a script, src/scripts/convert_dl3dv.py, to process both the training and test sets. Running this script converts the original data into the required format. Please note that you will need to update the dataset paths in this script.
Please refer to here to acquire the processed 360p dataset (360x640 resolution), which can be used directly in our codebase.
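The converted data follows the chunked .torch format used by pixelSplat and its successors. As a quick sanity check, a chunk can be inspected as below; the path and field names are assumptions based on that format, so verify them against src/scripts/convert_dl3dv.py:

import torch

# Load one converted chunk (a list of per-scene dicts). The chunk path and
# the field names below are assumptions following the pixelSplat-style format.
chunk = torch.load("datasets/DL3DV_480P/train/000000.torch")
scene = chunk[0]
print(scene.keys())
print(len(scene["images"]))    # number of frames (JPEG-encoded byte tensors)
print(scene["cameras"].shape)  # per-frame camera parameters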
We release PM-Loss built on the DepthSplat architecture, specifically the version released in October 2024.
To render novel views and compute evaluation metrics from a pretrained model,
- get our pretrained models, and save them to checkpoints/
- run the following:
# dl3dv (the index_path below is assumed; check assets/ for the actual DL3DV evaluation split)
python -m src.main \
+experiment=dl3dv mode=test \
dataset/view_sampler=evaluation \
dataset.image_shape=[256,448] \
dataset.view_sampler.num_context_views=2 \
dataset.view_sampler.index_path=assets/dl3dv_bound_aware.json \
model.encoder.multiview_trans_nearest_n_views=3 \
model.encoder.costvolume_nearest_n_views=3 \
model.encoder.offset_mode=unconstrained \
checkpointing.pretrained_model=checkpoints/pmloss.ckpt \
test.compute_scores=true
# re10k
python -m src.main \
+experiment=re10k mode=test \
dataset/view_sampler=evaluation \
dataset.image_shape=[256,448] \
dataset.view_sampler.num_context_views=2 \
dataset.view_sampler.index_path=assets/re10k_bound_aware.json \
model.encoder.multiview_trans_nearest_n_views=3 \
model.encoder.costvolume_nearest_n_views=3 \
model.encoder.offset_mode=unconstrained \
checkpointing.pretrained_model=checkpoints/pmloss.ckpt \
test.compute_scores=true
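With test.compute_scores=true, the run reports standard novel view synthesis metrics such as PSNR, SSIM, and LPIPS. For reference, PSNR on images normalized to [0, 1] reduces to the following (a generic definition, not this repo's exact implementation):

import torch

def psnr(pred: torch.Tensor, gt: torch.Tensor) -> torch.Tensor:
    # pred, gt: (B, 3, H, W) rendered and ground-truth images in [0, 1].
    # With a peak value of 1, PSNR reduces to -10 * log10(MSE).
    mse = ((pred - gt) ** 2).mean(dim=(1, 2, 3))
    return -10.0 * torch.log10(mse)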
To train the model,
- download the pretrained weights of DepthSplat, and save them to depthsplat_pretrained/
- run the following:
python -m src.main +experiment=dl3dv data_loader.train.batch_size=1 \
model.encoder.offset_mode=unconstrained \
loss=[mse,lpips,pcd] \
loss.pcd.weight=0.005 loss.pcd.gt_mode=vggt loss.pcd.ignore_large_loss=100.0 \
dataset.image_shape=[256,448] \
dataset.view_sampler.num_target_views=8 \
dataset.view_sampler.num_context_views=6 \
dataset.min_views=2 \
dataset.max_views=6 \
dataset.view_sampler.min_distance_between_context_views=20 \
dataset.view_sampler.max_distance_between_context_views=50 \
trainer.max_steps=100001
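To unpack the pcd flags above, here is a hypothetical sketch of how the three loss terms might combine, including the ignore_large_loss guard; the exact composition and the LPIPS weight are assumptions, and only the pcd values are taken from the command:

import torch

def combine_losses(mse_loss: torch.Tensor,
                   lpips_loss: torch.Tensor,
                   pcd_loss: torch.Tensor,
                   lpips_weight: float = 0.05,
                   pcd_weight: float = 0.005,
                   ignore_large_loss: float = 100.0) -> torch.Tensor:
    # Hypothetical composition of loss=[mse,lpips,pcd]; the lpips weight is
    # a placeholder, while pcd_weight and ignore_large_loss mirror the flags.
    total = mse_loss + lpips_weight * lpips_loss
    # Guard suggested by loss.pcd.ignore_large_loss=100.0: drop the pointmap
    # term when it is abnormally large (e.g. an unreliable pseudo ground truth).
    if pcd_loss.item() < ignore_large_loss:
        total = total + pcd_weight * pcd_loss
    return total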
If you find our work useful for your research, please consider citing us:
@article{shi2025pmloss,
title={Revisiting Depth Representations for Feed-Forward 3D Gaussian Splatting},
author={Shi, Duochao and Wang, Weijie and Chen, Donny Y. and Zhang, Zeyu and Bian, Jiawang and Zhuang, Bohan and Shen, Chunhua},
journal={arXiv preprint arXiv:2506.05327},
year={2025}
}
If you have any questions, please create an issue on this repository or contact us at dcshi@zju.edu.cn.
This project is built upon several fantastic repos: VGGT, MVSplat, and DepthSplat. We thank the original authors for their excellent work.