Stars
Official repository for "AM-RADIO: Reduce All Domains Into One"
[ICLR 2026] Large Depth Completion Model from Sparse Observations
metalbot is an iPhone-first autonomous RC car project.
[CVPR 2025] RollingDepth: Video Depth without Video Models
All-in-one training for vision models (YOLO, ViTs, RT-DETR, DINOv3): pretraining, fine-tuning, distillation.
PyTorch implementation of JiT https://arxiv.org/abs/2511.13720
[ECCV'24] GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image
Official implementation of "DepthMaster: Taming Diffusion Models for Monocular Depth Estimation".
Denoising Diffusion Probabilistic Models
A curated list of recent style transfer methods with diffusion models
DOC-Depth: A novel approach for dense depth ground truth generation
Cosmos-Transfer2.5, built on top of Cosmos-Predict2.5, produces high-quality world simulations conditioned on multiple spatial control inputs.
Implementation of the paper "DeepLSD: Line Segment Detection and Refinement with Deep Image Gradients"
Python composable command line interface toolkit
Official implementation for HybridDepth Model [WACV 2025, ISMAR 2024]
[NeurIPS 2025] DualFocus: Depth from Focus with Spatio-Focal Dual Variational Constraints
Official repo for: Epipolar Geometry Improves Video Generation Models
(3DV 2026 Oral) L4P -- a feed-forward foundational model designed for multiple low-level 4D vision perception tasks.
[CVPR2026] Detect Anything via Next Point Prediction
Depth-Anything-V2 tinygrad implementation
[DEIMv2] Real Time Object Detection Meets DINOv3
Official implementation of Continuous 3D Perception Model with Persistent State
The official repository of the paper "DCDepth: Progressive Monocular Depth Estimation in Discrete Cosine Domain" (NeurIPS-2024)
Visual Perception Engine: fast and flexible framework designed to run multiple perception models in an optimized and concurrent manner on NVIDIA Jetson