Johan Edstedt · David Nordström · Yushan Zhang · Georg Bökman · Jonathan Astermark · Viktor Larsson · Anders Heyden · Fredrik Kahl · Mårten Wadenbäck · Michael Felsberg
import cv2
from romav2 import RoMaV2
# load pretrained model
model = RoMaV2()
# Match densely for any image-like pair of inputs
preds = model.match(img_A_path, img_B_path)
# you can also run the forward method directly as
# preds = model(img_A, img_B)
# Sample 5000 matches for estimation
matches, overlaps, precision_AB, precision_BA = model.sample(preds, 5000)
# Convert to pixel coordinates (RoMaV2 produces matches in [-1,1]x[-1,1])
kptsA, kptsB = model.to_pixel_coordinates(matches, H_A, W_A, H_B, W_B)
# Find a fundamental matrix (or anything else of interest)
F, mask = cv2.findFundamentalMat(
    kptsA.cpu().numpy(), kptsB.cpu().numpy(), ransacReprojThreshold=0.2, method=cv2.USAC_MAGSAC, confidence=0.999999, maxIters=10000
)

We additionally provide two demos in the demos folder, which might help in understanding.
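The conversion from normalized to pixel coordinates in the usage example can be illustrated with a small standalone sketch. This is not the library's implementation: the helper name `normalized_to_pixels` is hypothetical, and it assumes that -1 and 1 map to the outer image borders and that each match row is ordered (xA, yA, xB, yB).

```python
import numpy as np


def normalized_to_pixels(matches, H_A, W_A, H_B, W_B):
    """Map (N, 4) matches in [-1, 1] x [-1, 1] to pixel coordinates.

    Assumptions (not confirmed by the README): each row is
    (xA, yA, xB, yB), and -1 / 1 correspond to the outer image borders,
    so x_pixel = W / 2 * (x + 1) and y_pixel = H / 2 * (y + 1).
    """
    matches = np.asarray(matches, dtype=np.float64)
    kptsA = np.stack(
        [W_A / 2 * (matches[:, 0] + 1), H_A / 2 * (matches[:, 1] + 1)], axis=-1
    )
    kptsB = np.stack(
        [W_B / 2 * (matches[:, 2] + 1), H_B / 2 * (matches[:, 3] + 1)], axis=-1
    )
    return kptsA, kptsB


# The centre of image A matched to the bottom-left corner of image B.
kptsA, kptsB = normalized_to_pixels([[0.0, 0.0, -1.0, 1.0]], 480, 640, 100, 200)
```

In practice, prefer model.to_pixel_coordinates, which follows whatever convention the model was trained with.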
In your Python environment (tested on Linux with Python 3.12), run:
uv pip install -e .
or
uv sync

If you do not already have MegaDepth and ScanNet, you can run the following to download them:
source scripts/eval_prep.sh
uv run tests/test_mega1500.py
uv run tests/test_scannet1500.py

Experiments on ScanNet-1500 and MegaDepth-1500 are provided in the tests folder.
Running these gave me ScanNet-1500: [34.0, 56.5, 73.9] and Mega-1500: [62.8, 76.8, 86.5], which are similar to the results in the paper.
Include the --extra fused-local-corr flag as:
uv sync --extra fused-local-corr
or
uv pip install romav2[fused-local-corr]
or
uv add romav2[fused-local-corr]

By twiddling with some different settings you may reach better results on your task of interest.
Some important ones, which we enable setting to some reasonable defaults through model.apply_setting, are:
model.H_lr, model.W_lr: height and width for the image pair.
model.H_hr, model.W_hr: height and width for a high resolution version of the image pair (used for upsampling as in RoMa)
model.bidirectional: Useful for getting more diverse matches, and for estimating the covariance matrix in both directions.
model.threshold: Value in [0,1]. Overlap predictions above it are set to 1. Useful for Mega1500.
model.balanced_sampling: Diverse sampling, same as RoMa. Typically helps to get better RANSAC estimates.
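As an illustration of what the model.threshold description implies, the following is a minimal sketch of saturating overlap predictions above a threshold. The function name `apply_overlap_threshold` is hypothetical and the code is not taken from the RoMa v2 codebase; it only assumes that overlap predictions are values in [0,1].

```python
import numpy as np


def apply_overlap_threshold(overlap, threshold):
    """Set every overlap prediction strictly above `threshold` to 1.

    Values at or below the threshold are left unchanged. Whether the
    library uses a strict or non-strict comparison is an assumption.
    """
    overlap = np.asarray(overlap, dtype=np.float64)
    return np.where(overlap > threshold, 1.0, overlap)


# Confident predictions saturate to 1, weak ones keep their value.
thresholded = apply_overlap_threshold([0.02, 0.5, 0.9], threshold=0.05)
```

One plausible effect is that all sufficiently confident matches are weighted equally when matches are later sampled, instead of being ranked by raw overlap.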
All our code except DINOv3 is released under the MIT license. DINOv3 has a custom license; see DINOv3.
Our codebase builds mainly on the code in RoMa. We were additionally inspired by UFM and MapAnything, particularly for the datasets used to train the models.
If you find our models useful, please consider citing our paper!
@article{edstedt2025romav2,
  title={{RoMa v2: Harder Better Faster Denser Feature Matching}},
  author={Johan Edstedt and David Nordström and Yushan Zhang and Georg Bökman and Jonathan Astermark and Viktor Larsson and Anders Heyden and Fredrik Kahl and Mårten Wadenbäck and Michael Felsberg},
  journal={arXiv preprint arXiv:2511.15706},
  year={2025}
}