F4Splat: Feed-Forward Predictive Densification for Feed-Forward 3D Gaussian Splatting

A PyTorch implementation of F4Splat (Kim et al., 2026), which performs spatially adaptive Gaussian allocation for feed-forward 3D Gaussian Splatting from sparse, uncalibrated images.

Key Features

Predictive Densification: Learns a densification score that predicts where additional Gaussians should be allocated, without iterative optimization.
Budget-Controllable: Adjust the total number of Gaussians at inference time via a single threshold, with no retraining required.
Spatially Adaptive Allocation: Concentrates Gaussians on geometrically complex regions while avoiding redundancy in simple or overlapping areas.
Uncalibrated Input: Works directly from uncalibrated multi-view images with joint camera parameter estimation.

Architecture

Context Images
      |
      v
 [DINOv2 Encoder]  (frozen)
      |
      v
 [VGGT-style Backbone]  (frame-wise + global self-attention)
      |
      +---> Camera Head ---> Intrinsics K, Extrinsics T
      |
      v
 [DPT Decoder]  (multi-scale feature maps at L=3 levels)
      |
      +---> Gaussian Center Head ---> Depth maps ---> 3D centers
      |
      +---> Gaussian Param Head  ---> Opacity, Rotation, Scale, SH, Densification Scores
      |
      v
 [Spatially Adaptive Allocation]  (threshold tau + budget matching)
      |
      v
 [gsplat Renderer] ---> Novel View Synthesis
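The "Depth maps ---> 3D centers" step in the diagram is a back-projection through the predicted camera. A minimal sketch of that step for a single pixel follows, assuming a pinhole camera, depth measured along the optical axis, and an extrinsic matrix in camera-to-world convention. None of these conventions is stated in this README, and `unproject` and `camera_to_world` are hypothetical helper names, not functions from the repository.

```python
def unproject(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with z-depth `depth` into camera coordinates.

    Assumes a pinhole model with focal lengths (fx, fy) and principal
    point (cx, cy), all in pixels.
    """
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return (x, y, depth)


def camera_to_world(point, cam_to_world):
    """Apply a 4x4 camera-to-world matrix (row-major nested lists) to a 3D point."""
    x, y, z = point
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in cam_to_world[:3])


# Illustrative values: a 256x256 image with the principal point at the centre.
center_cam = unproject(192, 128, 2.0, fx=128.0, fy=128.0, cx=128.0, cy=128.0)

# Identity rotation, camera shifted one unit along the world z-axis.
T = [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 0, 1]]
center_world = camera_to_world(center_cam, T)
print(center_cam, center_world)
```

In a batched implementation the same arithmetic would run over every pixel of every depth map at once.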

Project Structure

f4splat/
├── config.py                 # All configuration dataclasses
├── model/
│   ├── backbone.py           # DINOv2 + VGGT-style geometry backbone
│   ├── decoder.py            # Multi-scale DPT decoder (L=3 levels)
│   ├── heads.py              # Gaussian Center Head + Parameter Head
│   └── f4splat_model.py      # Unified model (train & inference pipelines)
├── gaussian/
│   ├── allocation.py         # Adaptive allocation masks + budget matching (Alg. 1 & 2)
│   └── renderer.py           # gsplat differentiable rendering wrapper
├── loss/
│   └── losses.py             # Rendering, score, camera, and scene-scale losses
├── data/
│   └── dataset.py            # RealEstate10K & ACID dataset loaders
├── utils/
│   ├── geometry.py           # Sim(3) alignment, camera ops, quaternion helpers
│   └── metrics.py            # PSNR, SSIM
├── train.py                  # Distributed training (DDP + AMP)
└── eval.py                   # Evaluation script

Installation

# Clone
git clone https://github.com/<your-username>/f4splat.git
cd f4splat

# Install dependencies
pip install -r requirements.txt

Requirements

Python >= 3.10
PyTorch >= 2.1
gsplat >= 1.0
NVIDIA GPU (training was validated on H200)

Data Preparation

Download RealEstate10K and/or ACID and organize as:

data/
├── re10k/
│   ├── train/
│   │   └── <scene_id>/
│   │       ├── images/
│   │       │   ├── 000000.png
│   │       │   └── ...
│   │       └── cameras.npz    # intrinsics (N,3,3), extrinsics (N,4,4)
│   └── test/
│       └── ...
└── acid/
    └── ...
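Before launching training, it can help to confirm that every scene folder matches the layout above. The sketch below checks only what the tree shows (an `images/` folder containing at least one `.png` file and a `cameras.npz` file); `find_incomplete_scenes` is a hypothetical helper and is not part of the repository.

```python
from pathlib import Path


def find_incomplete_scenes(split_dir):
    """Return the names of scene folders missing images/*.png or cameras.npz."""
    incomplete = []
    for scene in sorted(Path(split_dir).iterdir()):
        if not scene.is_dir():
            continue  # ignore stray files next to the scene folders
        images_dir = scene / "images"
        has_images = images_dir.is_dir() and any(images_dir.glob("*.png"))
        has_cameras = (scene / "cameras.npz").is_file()
        if not (has_images and has_cameras):
            incomplete.append(scene.name)
    return incomplete


if __name__ == "__main__":
    split = Path("data/re10k/train")
    if split.is_dir():
        print(find_incomplete_scenes(split))
```

The check does not open `cameras.npz`, so it does not verify the array shapes noted in the tree.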

Training

Multi-view (default)

# 8 GPUs, ~15 hours
NUM_GPUS=8 bash scripts/train_multiview.sh

Two-view

bash scripts/train_twoview.sh re10k

Single-GPU (debug)

python -m f4splat.train \
    --dataset re10k \
    --data-root ./data \
    --output-dir ./outputs \
    --max-iterations 15000 \
    --image-size 256

Key arguments:

| Argument | Default | Description |
|---|---|---|
| `--batch-images` | 24 | Total images per iteration (batch size adapts to view count) |
| `--lr` | 2e-4 | Learning rate |
| `--two-view` | off | Two-view training mode |
| `--no-amp` | off | Disable mixed precision |

Evaluation

python -m f4splat.eval \
    --checkpoint outputs/checkpoint_final.pt \
    --dataset re10k \
    --data-root ./data \
    --n-views 8 16 24

To evaluate with a specific Gaussian budget:

python -m f4splat.eval \
    --checkpoint outputs/checkpoint_final.pt \
    --target-gaussians 500000

Method Overview

Densification Score

During training, the network learns to predict a per-region densification score from input images alone. The ground-truth signal is derived from the rendering loss gradient:

d_g = log(1 + 1e4 * ||v_g||_2)

where v_g is the accumulated view-space positional gradient of each Gaussian.
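As a worked example of the formula above, the sketch below evaluates the target score for a single Gaussian. It assumes the logarithm is the natural log and that the gradient is a small vector of components; the README states neither, and `densification_target` is a hypothetical name.

```python
import math


def densification_target(view_space_grad, scale=1e4):
    """Return d_g = log(1 + scale * ||v_g||_2) for one Gaussian.

    view_space_grad: iterable with the components of the accumulated
    view-space positional gradient. Natural log is an assumption.
    """
    norm = math.sqrt(sum(c * c for c in view_space_grad))
    return math.log1p(scale * norm)


# A zero gradient maps to a score of exactly 0.
flat = densification_target((0.0, 0.0))

# ||(3e-4, 4e-4)||_2 = 5e-4, so the score is log(1 + 5) = log(6).
detailed = densification_target((3e-4, 4e-4))
print(flat, detailed)
```

The log compresses the wide dynamic range of gradient norms, so a Gaussian with a much larger gradient receives a moderately larger target rather than one that is orders of magnitude larger.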

Spatially Adaptive Allocation

Given multi-scale Gaussian maps at L=3 levels and a threshold tau:

Coarsest level: Allocate a Gaussian where score < tau (simple regions)
Intermediate levels: Allocate where score < tau AND the region is not already covered by a coarser level
Finest level: Allocate every region not covered by any coarser level
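The three rules can be sketched as boolean masks over per-level score grids. This is an illustration under stated assumptions, not the code in `allocation.py`: each level is assumed to have twice the resolution of the previous one, a coarse cell is assumed to cover the 2x2 block of finer cells beneath it, and `upsample2x` and `allocation_masks` are hypothetical names.

```python
def upsample2x(mask):
    """Nearest-neighbour 2x upsampling of a 2D boolean grid."""
    out = []
    for row in mask:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def allocation_masks(scores, tau):
    """Return one boolean mask per level; scores are ordered coarse to fine."""
    masks = []
    covered = None
    last = len(scores) - 1
    for level, grid in enumerate(scores):
        if covered is None:
            covered = [[False] * len(grid[0]) for _ in grid]
        else:
            covered = upsample2x(covered)  # carry coverage to this resolution
        if level == last:
            # Finest level: take every cell that is still uncovered.
            mask = [[not c for c in crow] for crow in covered]
        else:
            mask = [[(s < tau) and not c for s, c in zip(srow, crow)]
                    for srow, crow in zip(grid, covered)]
        masks.append(mask)
        covered = [[c or m for c, m in zip(crow, mrow)]
                   for crow, mrow in zip(covered, mask)]
    return masks


scores = [
    [[0.1, 0.9]],                                      # level 0: 1x2
    [[0.2, 0.2, 0.3, 0.9], [0.2, 0.2, 0.9, 0.3]],      # level 1: 2x4
    [[0.0] * 8 for _ in range(4)],                     # level 2: 4x8 (scores unused)
]
masks = allocation_masks(scores, tau=0.5)
counts = [sum(map(sum, m)) for m in masks]
print(counts)  # [1, 2, 8]
```

In this toy input every finest-resolution cell ends up covered exactly once: the one coarse Gaussian covers 16 cells, the two intermediate ones cover 4 each, and the 8 finest ones cover the rest of the 32 cells.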

Budget Matching (Algorithms 1 & 2)

At inference, given a target Gaussian count, a binary search over tau finds the threshold whose allocation most closely matches the target. No retraining is needed.
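The search works because the Gaussian count can only fall as tau rises: a higher threshold lets more regions stay at a coarse level. The sketch below shows the idea with a toy two-level count model. It is not a transcription of Algorithms 1 and 2, and `count_gaussians`, `match_budget`, the search bounds, and the iteration count are all illustrative assumptions.

```python
def count_gaussians(scores, tau, children=4):
    """Toy count: a coarse cell with score < tau stays one Gaussian,
    otherwise it is replaced by `children` finer Gaussians."""
    return sum(1 if s < tau else children for s in scores)


def match_budget(scores, target, lo=0.0, hi=10.0, iters=40):
    """Bisect tau so the toy count lands as close as possible to `target`.

    Relies on count_gaussians being non-increasing in tau.
    """
    best_tau, best_count = hi, count_gaussians(scores, hi)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        n = count_gaussians(scores, mid)
        if abs(n - target) < abs(best_count - target):
            best_tau, best_count = mid, n
        if n > target:
            lo = mid  # over budget: raise tau so more cells stay coarse
        else:
            hi = mid  # at or under budget: lower tau to densify more
    return best_tau, best_count


tau, n = match_budget([0.5, 1.0, 2.0, 3.0], target=10)
print(tau, n)  # 1.25 10
```

Because the count changes in discrete steps, an exact match is not always possible, which is why the sketch tracks the closest count seen rather than requiring equality.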

Loss Functions

L_total = L_render + 1e-4 * L_score + 10 * L_camera + 1e-2 * L_scene

| Loss | Description |
|---|---|
| `L_render` | MSE + 0.05 * LPIPS between rendered and target novel views |
| `L_score` | L1 between predicted and gradient-based densification scores |
| `L_camera` | Geodesic rotation + L2 translation error |
| `L_scene` | Regularizes average Gaussian center distance to 1 |
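The weighting can be written out as two small functions. This is only a sketch of the arithmetic with made-up loss values; the individual terms (LPIPS, geodesic rotation error, and so on) are not implemented here, and `render_loss` and `total_loss` are hypothetical names rather than functions from `losses.py`.

```python
def render_loss(mse, lpips, lpips_weight=0.05):
    """L_render = MSE + 0.05 * LPIPS."""
    return mse + lpips_weight * lpips


def total_loss(l_render, l_score, l_camera, l_scene):
    """L_total = L_render + 1e-4 * L_score + 10 * L_camera + 1e-2 * L_scene."""
    return l_render + 1e-4 * l_score + 10.0 * l_camera + 1e-2 * l_scene


# Illustrative values only, not measured results.
l_render = render_loss(mse=0.01, lpips=0.2)  # 0.01 + 0.05 * 0.2 = 0.02
loss = total_loss(l_render, l_score=5.0, l_camera=0.001, l_scene=0.5)
print(loss)  # approximately 0.0355
```

The functions use only `+` and `*`, so they would work unchanged on scalar PyTorch tensors as well as on floats.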

Citation

@article{kim2026f4splat,
  title={F4Splat: Feed-Forward Predictive Densification for Feed-Forward 3D Gaussian Splatting},
  author={Kim, Injae and Kim, Chaehyeon and Bae, Minseong and Joo, Minseok and Kim, Hyunwoo J.},
  journal={arXiv preprint arXiv:2603.21304},
  year={2026}
}

Acknowledgments

This implementation builds upon:

VGGT — Geometry backbone architecture
DINOv2 — Image encoder
gsplat — Differentiable Gaussian rasterization
NoPoSplat — RGB shortcut and training strategy

License

This project is released for research purposes. Please refer to the original paper and upstream licenses for usage terms.
