Stars
A simple state update rule to enhance length generalization for CUT3R
HISDF (Human Instance, Skeleton, and Depth Fusion) is a unified model that fuses human instance segmentation, skeletal structure estimation, and depth prediction to achieve holistic human perceptio…
[CVPR'25 Oral] MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision
Reference PyTorch implementation and models for DINOv3
[CoRL 2025 Oral] One View, Many Worlds: Single-Image to 3D Object Meets Generative Domain Randomization for One-Shot 6D Pose Estimation.
ViPE: Video Pose Engine for Geometric 3D Perception
Official Implementation of "Dens3R: A Foundation Model for 3D Geometry Prediction"
Code for Streaming 4D Visual Geometry Transformer
Official code repository for FlashDepth: Real-time Streaming Video Depth Estimation at 2K Resolution
Tools for simple inference testing with onnxruntime-web using CPU, WebGL, and WebGPU.
Distributed Robot Interaction Dataset.
A new markup-based typesetting system that is powerful and easy to learn.
[SPL 2025] LYT-Net: Lightweight YUV Transformer-based Network for Low-Light Image Enhancement
Visualizing the DROID dataset using Rerun
A unified library for object tracking featuring clean-room re-implementations of leading multi-object tracking algorithms
[ICCV'25 Oral] ReCamMaster: Camera-Controlled Generative Rendering from A Single Video
Repository for performing feature matching using the DeDoDev2 model in PyTorch
Making reading and writing OpenEXR images in Python easy using NumPy arrays.
Merging YOLOv9 and DepthAnythingV2
[CVPR 2025] Sparse Voxels Rasterization: Real-time High-fidelity Radiance Field Rendering
Ranking Google Scholar search results based on the number of citations
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
[CVPR2025 && NTIRE2025] HVI: A New Color Space for Low-light Image Enhancement (Official Implementation)
[CVPR 2025 Highlight] Video Depth Anything: Consistent Depth Estimation for Super-Long Videos
[CVPR 2025 Best Paper Nomination] FoundationStereo: Zero-Shot Stereo Matching