Stars
The official implementation of InfiniteVGGT
Fast-FoundationStereo: Real-Time Zero-Shot Stereo Matching
Official implementation of Lotus-2: Advancing Geometric Dense Prediction with Powerful Image Generative Model
Official implementation of "MV-TAP: Tracking Any Point in Multi-View Videos"
The repository provides code for running inference with the SAM 3D Body Model (3DB), links for downloading the trained model checkpoints and datasets, and example notebooks that show how to use the…
The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks that sho…
Official implementation of "S²M²: Scalable Stereo Matching Model for Reliable Depth Estimation, ICCV 2025"
IGGT: Instance-Grounded Geometry Transformer for Semantic 3D Reconstruction
Wrapper of 50+ image matching models with a unified interface
A simple state update rule to enhance length generalization for CUT3R
HISDF (Human Instance, Skeleton, and Depth Fusion) is a unified model that fuses human instance segmentation, skeletal structure estimation, and depth prediction to achieve holistic human perceptio…
[NeurIPS 2025] Pixel-Perfect Depth
VGGT-X: When VGGT Meets Dense Novel View Synthesis
[CVPR'25 Oral] MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision
Reference PyTorch implementation and models for DINOv3
[CoRL 2025 Oral] One View, Many Worlds: Single-Image to 3D Object Meets Generative Domain Randomization for One-Shot 6D Pose Estimation
ViPE: Video Pose Engine for Geometric 3D Perception
[ICLR 2026] Official Implementation of "Dens3R: A Foundation Model for 3D Geometry Prediction"
[ICLR 2026] Streaming 4D Visual Geometry Transformer
The official implementation of ICCV'25 paper "FlashDepth: Real-time Streaming Video Depth Estimation at 2K Resolution"
Tools for simple inference testing with onnxruntime-web using CPU, WebGL, and WebGPU.
Distributed Robot Interaction Dataset.
A markup-based typesetting system that is powerful and easy to learn.
[SPL 2025] LYT-Net: Lightweight YUV Transformer-based Network for Low-Light Image Enhancement
Visualizing the DROID dataset using Rerun
Trackers gives you clean, modular re-implementations of leading multi-object tracking algorithms released under the permissive Apache 2.0 license. You combine them with any detection model you alre…