RoMa v2 🤖: Harder Better Faster Denser Feature Matching


[example figure]

How to Use

import cv2

from romav2 import RoMaV2

# load pretrained model
model = RoMaV2()
# Match densely for any image-like pair of inputs
preds = model.match(img_A_path, img_B_path)

# you can also run the forward method directly as 
# preds = model(img_A, img_B)

# Sample 5000 matches for estimation
matches, overlaps, precision_AB, precision_BA = model.sample(preds, 5000)

# Convert to pixel coordinates (RoMaV2 produces matches in [-1,1]x[-1,1]);
# H_A, W_A, H_B, W_B are the heights and widths of images A and B
kptsA, kptsB = model.to_pixel_coordinates(matches, H_A, W_A, H_B, W_B)

# Find a fundamental matrix (or anything else of interest)
F, mask = cv2.findFundamentalMat(
    kptsA.cpu().numpy(),
    kptsB.cpu().numpy(),
    ransacReprojThreshold=0.2,
    method=cv2.USAC_MAGSAC,
    confidence=0.999999,
    maxIters=10000,
)
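
The returned mask marks the RANSAC inliers; a short follow-up sketch for keeping only the inlier matches (the variable names here are illustrative):

# Keep only the RANSAC inliers (mask follows OpenCV's Nx1 uint8 convention)
inliers = mask.ravel().astype(bool)
kptsA_inliers = kptsA.cpu().numpy()[inliers]
kptsB_inliers = kptsB.cpu().numpy()[inliers]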

We additionally provide two demos in the demos folder, which may help illustrate usage.
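
If camera intrinsics are available, a relative pose can be recovered from the same sampled matches. A minimal sketch, assuming K_A is a 3x3 intrinsics matrix shared (approximately) by both cameras; the thresholds below are placeholders:

import cv2

# Estimate an essential matrix from the sampled pixel matches
E, inlier_mask = cv2.findEssentialMat(
    kptsA.cpu().numpy(),
    kptsB.cpu().numpy(),
    K_A,
    method=cv2.USAC_MAGSAC,
    prob=0.999999,
    threshold=1.0,
)
# Decompose into a rotation and a unit-norm translation
num_inliers, R, t, inlier_mask = cv2.recoverPose(
    E, kptsA.cpu().numpy(), kptsB.cpu().numpy(), K_A, mask=inlier_mask
)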

Setup/Install

In your Python environment (tested on Linux with Python 3.12), run:

uv pip install -e .

or

uv sync
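
A quick sanity check that the install worked (this only verifies that the package imports):

uv run python -c "from romav2 import RoMaV2"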

Benchmarks

If you do not already have MegaDepth and ScanNet, you can run the following to download them:

source scripts/eval_prep.sh

Mega-1500

uv run tests/test_mega1500.py

ScanNet-1500

uv run tests/test_scannet1500.py

Expected Results

Experiments on ScanNet-1500 and MegaDepth-1500 are provided in the tests folder. Running these gave us ScanNet-1500: [34.0, 56.5, 73.9] and Mega-1500: [62.8, 76.8, 86.5], which are similar to the results reported in the paper.

Fused local correlation kernel

To enable the fused local correlation kernel, install with the fused-local-corr extra:

uv sync --extra fused-local-corr

or

uv pip install romav2[fused-local-corr]

or

uv add romav2[fused-local-corr]
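
Note that some shells (e.g. zsh) treat square brackets specially, so the extra may need quoting:

uv pip install "romav2[fused-local-corr]"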

Settings

By tuning a few settings you may reach better results on your task of interest. Some important ones, which can be set to reasonable defaults through model.apply_setting, are listed below (a usage sketch follows the list):

model.H_lr, model.W_lr: low-resolution height and width at which the image pair is matched.

model.H_hr, model.W_hr: height and width of a high-resolution version of the image pair (used for upsampling, as in RoMa).

model.bidirectional: Useful for getting more diverse matches and for estimating the covariance matrix in both directions.

model.threshold: Value in [0, 1]. Overlap predictions above this threshold are set to 1. Useful for Mega-1500.

model.balanced_sampling: Diverse sampling, same as in RoMa. Typically helps to get better RANSAC estimates.
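
A hedged sketch of how these settings might be adjusted; direct attribute assignment is an assumption here, the values are placeholders, and model.apply_setting can instead be used for reasonable defaults:

# Hypothetical tuning for an outdoor, MegaDepth-style pair (placeholder values)
model.bidirectional = True        # more diverse matches, covariances in both directions
model.threshold = 0.05            # overlap predictions above this are set to 1
model.balanced_sampling = True    # balanced sampling, as in RoMa
preds = model.match(img_A_path, img_B_path)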

License

All our code, except for DINOv3, is released under the MIT license. DINOv3 has a custom license; see DINOv3.

Acknowledgement

Our codebase builds mainly on the code in RoMa. We were additionally inspired by UFM and MapAnything, particularly for the datasets used to train the models.

BibTeX

If you find our models useful, please consider citing our paper!

@article{edstedt2025romav2,
  title={{RoMa v2: Harder Better Faster Denser Feature Matching}},
  author={Johan Edstedt and David Nordström and Yushan Zhang and Georg Bökman and Jonathan Astermark and Viktor Larsson and Anders Heyden and Fredrik Kahl and Mårten Wadenbäck and Michael Felsberg},
  journal={arXiv preprint arXiv:2511.15706},
  year={2025}
}