Official repo for the paper "Multi-view Pyramid Transformer: Look Coarser to See Broader"
```bash
# create conda environment
conda create -n mvp python=3.11 -y
conda activate mvp

# install PyTorch and dependencies (adjust the CUDA version according to your system)
pip install -r requirements.txt
pip install git+https://github.com/nerfstudio-project/gsplat.git
```

The model checkpoints are hosted on HuggingFace (mvp_960x540).
For training and evaluation, we used the DL3DV dataset after applying undistortion preprocessing with this script, originally introduced in Long-LRM.
Download the DL3DV benchmark dataset from here, and apply undistortion preprocessing.
Update the inference.ckpt_path field in configs/inference.yaml with the pretrained model.
Update the entries in data/dl3dv_eval.txt to point to the correct processed dataset path.
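Before launching inference, it can help to sanity-check that every scene path listed in the `.txt` file actually exists on disk. A minimal sketch (the one-scene-directory-per-line format is an assumption about the file layout):

```python
from pathlib import Path

def check_scene_list(list_file):
    """Return scene paths listed in `list_file` that are missing on disk.

    Assumes one scene directory per line (hypothetical format);
    blank lines and `#` comments are skipped.
    """
    missing = []
    for line in Path(list_file).read_text().splitlines():
        scene = line.strip()
        if not scene or scene.startswith("#"):
            continue
        if not Path(scene).is_dir():
            missing.append(scene)
    return missing
```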
```bash
# inference
CUDA_VISIBLE_DEVICES=0 python inference.py --config configs/inference.yaml
```

Update configs/api_keys.yaml with your personal wandb API key.
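The exact layout of configs/api_keys.yaml is repo-specific; assuming it is a flat `key: value` file, a dependency-free reader might look like this (the `wandb` key name is an assumption):

```python
import os
from pathlib import Path

def load_api_keys(path):
    """Parse a flat `key: value` YAML-style file (hypothetical format).

    Skips blank lines and `#` comments; strips surrounding quotes
    from values. Not a full YAML parser.
    """
    keys = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        k, _, v = line.partition(":")
        keys[k.strip()] = v.strip().strip("'\"")
    return keys

# hypothetical usage: export the key so wandb picks it up
# os.environ["WANDB_API_KEY"] = load_api_keys("configs/api_keys.yaml")["wandb"]
```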
Update the entries in data/dl3dv_train.txt to point to the correct processed dataset path.
```bash
# Example for single-GPU training
CUDA_VISIBLE_DEVICES=0 python train_single.py --config configs/train_stage1.yaml

# Example for multi-GPU training
torchrun --nproc_per_node 8 --nnodes 1 \
  --rdzv_id 1234 --rdzv_endpoint localhost:8888 \
  train.py --config configs/train_stage1.yaml
```

- Training code (Stage 3)
- Preprocessed Tanks&Temples and Mip-NeRF360 datasets
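With the multi-GPU launch above, torchrun starts one process per GPU, and each rank typically trains on a disjoint shard of the data. A dependency-free sketch of the round-robin split that `torch.utils.data.DistributedSampler` performs (simplified: no shuffling and no padding of the last incomplete round):

```python
def shard_indices(num_samples, world_size, rank):
    """Round-robin split of dataset indices across `world_size` ranks.

    Simplified version of torch.utils.data.DistributedSampler:
    rank r gets indices r, r + world_size, r + 2 * world_size, ...
    (no shuffling, no padding so shard sizes may differ by one).
    """
    return list(range(rank, num_samples, world_size))
```

Each of the 8 processes launched by the torchrun command above would call this with `world_size=8` and its own rank (read from the `RANK` environment variable that torchrun sets).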
```bibtex
@article{kang2025multi,
  title={Multi-view Pyramid Transformer: Look Coarser to See Broader},
  author={Kang, Gyeongjin and Yang, Seungkwon and Nam, Seungtae and Lee, Younggeun and Kim, Jungwoo and Park, Eunbyung},
  journal={arXiv preprint arXiv:2512.07806},
  year={2025}
}
```
This project is built on many amazing research works, thanks a lot to all the authors for sharing!