
G3Splat

Geometrically Consistent Generalizable Gaussian Splatting

Mehdi Hosseinzadeh      Shin-Fang Chng      Yi Xu      Simon Lucey      Ian Reid      Ravi Garg

Project Page   arXiv   GitHub   Hugging Face   Demo

G3Splat is a pose-free, self-supervised framework for generalizable Gaussian splatting that achieves state-of-the-art performance in geometry reconstruction, relative pose estimation, and novel-view synthesis.

Teaser


✨ Highlights

  • 🎯 Pose-Free: No camera poses required at inference time
  • 🔄 Self-Supervised: Trained without ground-truth depth or 3D supervision
  • 🚀 Feed-Forward: Real-time inference with no per-scene optimization
  • 📐 Geometrically Consistent: Alignment and orientation losses for accurate 3D reconstruction
  • 🎨 Flexible: Supports both 3D Gaussian Splatting (3DGS) and 2D Gaussian Splatting (2DGS)

🛠️ Installation

Our implementation requires Python 3.10+ and has been tested with PyTorch 2.1.2 and CUDA 11.8/12.1.

1. Clone the Repository

git clone https://github.com/m80hz/g3splat
cd g3splat

2. Create Environment and Install Dependencies

conda create -y -n g3splat python=3.10
conda activate g3splat
pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu118
pip install "numpy<2"  # Required: PyTorch 2.1.2 is incompatible with NumPy 2.x
pip install -r requirements.txt

# Install CUDA rasterizers 
pip install git+https://github.com/rmurai0610/diff-gaussian-rasterization-w-pose.git --no-build-isolation
pip install git+https://github.com/hbb1/diff-surfel-rasterization.git --no-build-isolation

3. (Optional) Compile CUDA Kernels for RoPE

For faster inference, compile the CUDA kernels for RoPE positional embeddings:

cd src/model/encoder/backbone/croco/curope/
python setup.py build_ext --inplace
cd ../../../../../..

🏆 Model Zoo

We provide pretrained checkpoints on Hugging Face 🤗

Available Models

| Model | Backbone | Gaussian Type | Training Data | Resolution | Download |
|-------|----------|---------------|---------------|------------|----------|
| G³Splat-3DGS | MASt3R | 3DGS | RealEstate10K | 256×256 | 📥 Download |
| G³Splat-2DGS | MASt3R | 2DGS | RealEstate10K | 256×256 | 📥 Download |

🔜 Coming Soon

| Model | Backbone | Gaussian Type | Status |
|-------|----------|---------------|--------|
| G³Splat-VGGT-3DGS | VGGT | 3DGS | 🚧 Coming Soon |

Note: The code and checkpoints for G³Splat with the VGGT backbone will be released soon. Stay tuned for updates!

Downloading Models

Option 1: Direct Download

Download from the links in the table above and place in pretrained_weights/.

Option 2: Using Hugging Face Hub

pip install huggingface_hub

from huggingface_hub import hf_hub_download

# Download 3DGS model
hf_hub_download(
    repo_id="m80hz/g3splat",
    filename="g3splat_mast3r_3dgs_align_orient_re10k.ckpt",
    local_dir="pretrained_weights"
)

# Download 2DGS model
hf_hub_download(
    repo_id="m80hz/g3splat",
    filename="g3splat_mast3r_2dgs_align_orient_re10k.ckpt",
    local_dir="pretrained_weights"
)

Option 3: Using Git LFS

# Clone just the model files
git lfs install
git clone https://huggingface.co/m80hz/g3splat pretrained_weights

Model Configuration

Expected directory structure:

pretrained_weights/
├── g3splat_mast3r_3dgs_align_orient_re10k.ckpt
└── g3splat_mast3r_2dgs_align_orient_re10k.ckpt 

⚠️ Important: When using 2DGS models, you must set gaussian_type: 2d in the config:

# config/model/encoder/<backbone>.yaml  # e.g., noposplat.yaml, etc.
# (<backbone> is a placeholder for the encoder backbone config you are using)
gaussian_adapter:
  gaussian_type: 2d   # Use '3d' for 3DGS models (default)

Or pass it via command line: model.encoder.gaussian_adapter.gaussian_type=2d


🎮 Demo

We provide an interactive web demo powered by Gradio for visualizing G³Splat outputs.

Note: The demo is intended for quick visualization and verifying that the installation works correctly. To reproduce the quantitative results reported in the paper, please refer to the Evaluation section.

Quick Start

python demo.py --checkpoint pretrained_weights/g3splat_mast3r_3dgs_align_orient_re10k.ckpt

Then open your browser at http://localhost:7860

Demo Features

  • 📸 Image Input: Upload custom image pairs or use provided examples
  • 🎯 Pose-Free Inference: No camera poses required
  • 🖼️ Novel View Synthesis: Visualize rendered novel views with adjustable interpolation based on estimated poses
  • 📊 Geometry Visualization: View depth maps, surface normals, and Gaussian normals
  • 🌐 Interactive 3D: Explore Gaussian splats in the browser
  • 💾 Export: Download PLY files for external visualization

Command Line Options

python demo.py \
    --checkpoint <path_to_checkpoint> \
    --port 7860 \
    --share

# --port    Server port (default: 7860)
# --share   Create a public Gradio link

Example Images

We provide example image pairs in assets/examples/ organized by dataset:

assets/examples/
├── re10k_001/           # RealEstate10K scene
│   ├── context_0.png
│   └── context_1.png
├── re10k_002/
│   ├── context_0.png
│   └── context_1.png
├── scannet_001/         # ScanNet scene
│   ├── context_0.png
│   └── context_1.png
└── ...

Scene folders are named with a dataset prefix (e.g., re10k_, scannet_) followed by a number. The demo automatically detects the dataset and uses appropriate camera intrinsics.
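As a sketch of this naming convention, the dataset prefix can be recovered from a folder name with a simple pattern match (illustrative only; the actual detection logic and intrinsics lookup in demo.py may differ):

```python
import re

def dataset_from_folder(name: str) -> str:
    """Strip the trailing _<number> to recover the dataset prefix,
    e.g. "re10k_001" -> "re10k". Hypothetical helper, not from demo.py."""
    match = re.fullmatch(r"([a-z0-9]+)_\d+", name)
    if match is None:
        raise ValueError(f"unrecognized example folder: {name!r}")
    return match.group(1)

print(dataset_from_folder("re10k_001"))    # re10k
print(dataset_from_folder("scannet_001"))  # scannet
```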


📦 Datasets

G³Splat is trained on RealEstate10K and evaluated zero-shot on multiple benchmarks.

Dataset Overview

| Dataset | Usage | Task | Download |
|---------|-------|------|----------|
| RealEstate10K | Training and Testing | NVS, Pose | 📥 Instructions |
| ACID | Zero-shot | NVS, Pose | 📥 Instructions |
| ScanNet | Zero-shot | NVS, Pose, Depth, Mesh | 📥 Instructions |
| NYU Depth V2 | Zero-shot | Single-View Depth | 📥 Instructions |

Expected Directory Structure

datasets/
├── re10k/
│   ├── train/
│   │   ├── 000000.torch
│   │   ├── ...
│   │   └── index.json
│   └── test/
│       ├── 000000.torch
│       ├── ...
│       └── index.json
├── acid/
│   ├── train/
│   │   ├── 000000.torch
│   │   ├── ...
│   │   └── index.json
│   └── test/
│       ├── 000000.torch
│       ├── ...
│       └── index.json
├── scannetv1_test/
│   ├── scene0664_00/
│   │   ├── color/
│   │   │    ├── 0.png
│   │   │    ...
│   │   ├── depth/
│   │   │    ├── 0.png
│   │   │    ...
│   │   ├── intrinsic/
│   │   │    ├── intrinsic_color.txt
│   │   │    └── intrinsic_depth.txt
│   │   ├── pose/
│   │   │    ├── 0.txt
│   │   │    ...
│   │   └── mesh/
│   │        └── scene0664_00_vh_clean_2.ply
│   ├── ...
│   └── scannet_test_pairs.txt
└── nyud_test/
    ├── color/
    │   ├── 0001.png
    │   └── ...
    ├── depth/
    │   ├── 0001.png
    │   └── ...
    └── intrinsic_color.txt

Note: By default, datasets are expected in datasets/. Override with:

dataset.DATASET_NAME.roots=[/your/path]

Dataset Preparation

📁 RealEstate10K (Training)

We follow pixelSplat's data processing pipeline. See the pixelSplat dataset guide for instructions on downloading and processing the dataset (use the 360p version, recommended for 256×256 training). You can also download the preprocessed dataset directly from the same page.

📁 ACID (Zero-shot Evaluation)

Visit the ACID Dataset Page to download the raw data, then convert the dataset by following the instructions in the pixelSplat dataset guide. Alternatively, you can download the preprocessed version directly from the same guide.

📁 ScanNet (Zero-shot Evaluation)
  1. Request Access: Visit the ScanNet official page and request access to the dataset.

  2. Download Data: Once approved, download the ScanNet v1 test set, including color images, depth maps, camera poses, and mesh reconstructions. Use scripts/download_scannet_v1_test_meshes.sh to download the mesh files.

  3. Test Pairs: Download the test split file scannet_test_pairs.txt, which defines the image pairs for test scenes and context views.

📁 NYU Depth V2 (Zero-shot Single-View Depth)
  1. Visit the official dataset page: Go to the NYU Depth V2 Dataset website.

  2. Download the data: Follow the instructions on the page to obtain the dataset files you need.

  3. (Optional) Use the preprocessed test set: Download the preprocessed test split here: nyud_test.


Evaluation

Depth Evaluation

Multi-View Depth

ScanNet (Zero-shot)
python -m src.eval_depth +experiment=scannet_depth_align_orient +evaluation=eval_depth \
    checkpointing.load=pretrained_weights/g3splat_mast3r_3dgs_align_orient_re10k.ckpt

Single-View Depth

NYU Depth V2 (Zero-shot)
python -m src.eval_depth +experiment=nyud_depth_align_orient +evaluation=eval_depth \
    checkpointing.load=pretrained_weights/g3splat_mast3r_3dgs_align_orient_re10k.ckpt

💡 Tip: Add evaluation.use_pose_refinement=false to disable test-time pose refinement.

Metrics: AbsRel ↓ | δ<1.10 ↑ | δ<1.25 ↑
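These are the standard depth metrics; a minimal reference implementation over flattened depth values (for illustration only, not the repository's evaluation code, which operates on full depth maps with validity masking):

```python
def abs_rel(pred, gt):
    """Mean absolute relative error: mean(|pred - gt| / gt)."""
    return sum(abs(p - g) / g for p, g in zip(pred, gt)) / len(gt)

def delta_acc(pred, gt, thresh=1.25):
    """Fraction of pixels with max(pred/gt, gt/pred) < thresh."""
    ok = sum(1 for p, g in zip(pred, gt) if max(p / g, g / p) < thresh)
    return ok / len(gt)

gt   = [1.0, 2.0, 4.0]
pred = [1.1, 1.8, 4.0]
print(round(abs_rel(pred, gt), 4))   # 0.0667
print(delta_acc(pred, gt, 1.25))     # 1.0
```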


Pose Estimation

Evaluate relative camera pose estimation:

RealEstate10K

python -m src.eval_pose +experiment=re10k_align_orient +evaluation=eval_pose \
    dataset/view_sampler@dataset.re10k.view_sampler=evaluation \
    dataset.re10k.view_sampler.index_path=assets/evaluation_index_re10k.json \
    checkpointing.load=pretrained_weights/g3splat_mast3r_3dgs_align_orient_re10k.ckpt

ACID (Zero-shot)

python -m src.eval_pose +experiment=acid_align_orient +evaluation=eval_pose \
    dataset/view_sampler@dataset.re10k.view_sampler=evaluation \
    dataset.re10k.view_sampler.index_path=assets/evaluation_index_acid.json \
    checkpointing.load=pretrained_weights/g3splat_mast3r_3dgs_align_orient_re10k.ckpt

ScanNet (Zero-shot)

python -m src.eval_pose +experiment=scannet_pose_align_orient +evaluation=eval_pose \
    checkpointing.load=pretrained_weights/g3splat_mast3r_3dgs_align_orient_re10k.ckpt

💡 Tip: Add evaluation.use_pose_refinement=false to disable test-time pose refinement.

Metrics: Rotation Error (°) ↓ | Translation Error (°) ↓ | AUC@5° ↑ | AUC@10° ↑ | AUC@20° ↑ | AUC@30° ↑
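AUC@τ summarizes the cumulative error curve up to a threshold τ. A common step-integration definition is sketched below (details of the repository's implementation may differ):

```python
def pose_auc(errors, threshold):
    """AUC of the recall-vs-error curve on [0, threshold] degrees,
    using step integration over the sorted errors."""
    errors = sorted(errors)
    n = len(errors)
    area, prev_e, recall = 0.0, 0.0, 0.0
    for i, e in enumerate(errors):
        if e > threshold:
            break
        area += recall * (e - prev_e)      # recall held constant between error values
        prev_e, recall = e, (i + 1) / n
    area += recall * (threshold - prev_e)  # extend the final recall level to the threshold
    return area / threshold

print(pose_auc([0.0, 0.0], 10))   # 1.0  (all pairs perfect)
print(pose_auc([5.0, 15.0], 10))  # 0.25 (one pair under threshold)
```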


Novel View Synthesis

RealEstate10K

python -m src.main +experiment=re10k_align_orient_1x8 mode=test \
    dataset/view_sampler@dataset.re10k.view_sampler=evaluation \
    dataset.re10k.view_sampler.index_path=assets/evaluation_index_re10k.json \
    checkpointing.load=pretrained_weights/g3splat_mast3r_3dgs_align_orient_re10k.ckpt

ACID (Zero-shot)

python -m src.main +experiment=acid_align_orient_1x8 mode=test \
    dataset/view_sampler@dataset.re10k.view_sampler=evaluation \
    dataset.re10k.view_sampler.index_path=assets/evaluation_index_acid.json \
    checkpointing.load=pretrained_weights/g3splat_mast3r_3dgs_align_orient_re10k.ckpt

ScanNet (Zero-shot)

python -m src.main +experiment=scannet_depth_align_orient mode=test \
    checkpointing.load=pretrained_weights/g3splat_mast3r_3dgs_align_orient_re10k.ckpt

💡 Tip: Set test.save_image=true and/or test.save_video=true to save rendered images and videos to the directory specified by test.output_path.

Metrics: PSNR ↑ | SSIM ↑ | LPIPS ↓
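PSNR follows the standard definition below, shown for reference only (SSIM and LPIPS involve structural and learned similarity and are not reproduced here):

```python
import math

def psnr(pred, gt, max_val=1.0):
    """Peak signal-to-noise ratio in dB; pred/gt are flattened pixel
    values in [0, max_val]."""
    mse = sum((p - g) ** 2 for p, g in zip(pred, gt)) / len(gt)
    return float("inf") if mse == 0 else 10.0 * math.log10(max_val ** 2 / mse)

print(round(psnr([0.5, 0.5], [0.5, 0.6]), 2))  # 23.01
```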


Mesh Evaluation

Evaluate 3D mesh reconstructions on ScanNet:

python -m src.eval_mesh +experiment=scannet_depth_align_orient +evaluation=eval_mesh \
    checkpointing.load=pretrained_weights/g3splat_mast3r_3dgs_align_orient_re10k.ckpt

Metrics: Accuracy ↓ | Completeness ↓ | Overall (Chamfer Distance) ↓
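A brute-force sketch of these point-cloud metrics under one common convention (real mesh evaluation samples points from the surfaces and uses spatial indices rather than the O(N·M) loop below):

```python
import math

def chamfer_metrics(pred, gt):
    """Accuracy: mean distance from predicted points to the GT point set.
    Completeness: mean distance from GT points to the prediction.
    Overall: their mean (one common Chamfer convention)."""
    def nearest(p, pts):
        return min(math.dist(p, q) for q in pts)
    acc  = sum(nearest(p, gt) for p in pred) / len(pred)
    comp = sum(nearest(g, pred) for g in gt) / len(gt)
    return acc, comp, (acc + comp) / 2

acc, comp, overall = chamfer_metrics([(0, 0, 0)], [(1, 0, 0), (0, 0, 0)])
print(acc, comp, overall)  # 0.0 0.5 0.25
```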


Export Gaussian PLY:

python -m src.main +experiment=re10k_align_orient_1x8 mode=test \
    dataset/view_sampler@dataset.re10k.view_sampler=evaluation \
    dataset.re10k.view_sampler.index_path=assets/evaluation_index_re10k.json \
    checkpointing.load=pretrained_weights/g3splat_mast3r_3dgs_align_orient_re10k.ckpt \
    test.save_gaussian=true 
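To sanity-check an exported file, the Gaussian count can be read off the PLY header. PLY is a standard format, but the per-Gaussian property layout varies by exporter, so this sketch only inspects the header:

```python
def ply_vertex_count(header_text: str) -> int:
    """Return the vertex (Gaussian) count declared in a PLY header."""
    for line in header_text.splitlines():
        parts = line.strip().split()
        if parts[:2] == ["element", "vertex"]:
            return int(parts[2])
    raise ValueError("no vertex element found in header")

header = """ply
format binary_little_endian 1.0
element vertex 123456
property float x
end_header"""
print(ply_vertex_count(header))  # 123456
```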

Evaluation Quick Reference

Task Dataset Script Experiment Config
Depth ScanNet src.eval_depth scannet_depth_align_orient
Depth NYU Depth V2 src.eval_depth nyud_depth_align_orient
Pose RE10K src.eval_pose re10k_align_orient
Pose ACID src.eval_pose acid_align_orient
Pose ScanNet src.eval_pose scannet_pose_align_orient
NVS RE10K src.main re10k_align_orient_1x8
NVS ACID src.main acid_align_orient_1x8
NVS ScanNet src.main scannet_depth_align_orient
Mesh ScanNet src.eval_mesh scannet_depth_align_orient

📜 Batch Evaluation: See scripts/eval_checkpoint.sh and scripts/all_evals.sh for unified scripts to run multiple evaluations.


Training

Prerequisites

Download the MASt3R pretrained weights:

mkdir -p pretrained_weights
wget -P pretrained_weights/ https://download.europe.naverlabs.com/ComputerVision/MASt3R/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric.pth

Training Commands

Multi-GPU Training (Recommended)
# 24× A100 GPUs (6 nodes × 4 GPUs), effective batch size 144
python -m src.main +experiment=re10k_align_orient \
    wandb.mode=online \
    wandb.name=g3splat_align_orient

Training Time: ~6 hours on 24× A100 (40GB)

SLURM Cluster: See slurm_train.sh for an example job script.

Single-GPU Training
# Single A6000 (48GB), batch size 8
python -m src.main +experiment=re10k_align_orient_1x8 \
    wandb.mode=online \
    wandb.name=g3splat_align_orient_1x8

Training Time: ~120 hours on 1× A6000

Training 2DGS Variant
python -m src.main +experiment=re10k_align_orient \
    model.encoder.gaussian_adapter.gaussian_type=2d \
    wandb.mode=online \
    wandb.name=g3splat_2dgs_align_orient

Training Configurations

| Config | Hardware | Batch Size | Training Time |
|--------|----------|------------|---------------|
| re10k_align_orient | 24× A100 | 144 | ~6 hours |
| re10k_align_orient_1x8 | 1× A6000 | 8 | ~120 hours |
| re10k_align | 24× A100 | 144 | ~6 hours |
| re10k_orient | 24× A100 | 144 | ~6 hours |

💡 Tip: When changing batch size, adjust learning rate and training steps proportionally for optimal convergence.
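One common heuristic for this adjustment is the linear scaling rule: scale the learning rate with batch size and stretch the step count so the total number of samples seen stays constant. A sketch (the base values below are placeholders, not the repository's actual hyperparameters):

```python
def scale_schedule(base_lr, base_bs, base_steps, new_bs):
    """Linear-scaling heuristic: lr scales with batch size,
    steps scale inversely, keeping total samples seen constant."""
    lr = base_lr * new_bs / base_bs
    steps = round(base_steps * base_bs / new_bs)
    return lr, steps

# e.g. moving from batch 144 to batch 8 (hypothetical base values)
lr, steps = scale_schedule(base_lr=2e-4, base_bs=144, base_steps=30_000, new_bs=8)
print(lr, steps)
```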


Acknowledgements

This project builds on several excellent open-source repositories: VGGT, NoPoSplat, MASt3R, DUSt3R, pixelSplat, and CUT3R. We thank the authors for their contributions to the community.


Citation

If you find G³Splat useful in your research, please consider citing:

@inproceedings{g3splat,
  title     = {G3Splat: Geometrically Consistent Generalizable Gaussian Splatting},
  author    = {Hosseinzadeh, Mehdi and Chng, Shin-Fang and Xu, Yi and Lucey, Simon and Reid, Ian and Garg, Ravi},
  booktitle = {arXiv:2512.17547},
  year      = {2025},
  url       = {https://arxiv.org/abs/2512.17547}
}

⭐ Star us on GitHub if you find this project useful! ⭐

Questions? Feel free to open an issue or reach out!
