- University of Michigan
- Ann Arbor
- jinlinyi.github.io
Stars
Unofficial DynaDUSt3R reimplementation trained on Stereo4D (research only).
[ICLR 2026] Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation
Code release for paper "Test-Time Training Done Right"
PreciseCam: Precise Camera Control for Text-to-Image Generation
[ICCV 2025] SpatialTrackerV2: 3D Point Tracking Made Easy
[NeurIPS 2025] Sekai: A Video Dataset towards World Exploration
Stereo4D dataset and processing code
Library for reading and processing ML training data.
[CVPR 2025] Code for Segment Any Motion in Videos
[CVPR 2025 Best Paper Nomination] FoundationStereo: Zero-Shot Stereo Matching
The official implementation of CVPR'25 Oral paper "Go-with-the-Flow: Motion-Controllable Video Diffusion Models Using Real-Time Warped Noise"
Code for the project "MegaSaM: Accurate, Fast and Robust Structure and Motion from Casual Dynamic Videos"
[CVPR 2025] Stereo Anywhere: Robust Zero-Shot Deep Stereo Matching Even Where Either Stereo or Mono Fail
[ICCV 2025] Official Code for Boost 3D Reconstruction using Diffusion-based Monocular Camera Calibration
[ICLR'25 Oral] No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images
[CVPR'25 Oral] MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision
Official Implementation of paper "MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion"
GeoCalib: Learning Single-image Calibration with Geometric Optimization (ECCV 2024)
Efficiently Composable Data Augmentation on the GPU with JAX
[CVPR 2025 Highlight] DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos
[CVPR 2024 Highlight] Official PyTorch implementation of SpatialTracker: Tracking Any 2D Pixels in 3D Space
Generative Camera Dolly: Extreme Monocular Dynamic Novel View Synthesis (ECCV 2024 Oral) - Official Implementation
HInt dataset from HaMeR: Reconstructing Hands in 3D with Transformers
Lightplane implements a highly memory-efficient differentiable radiance field renderer, and a module for unprojecting features from images to 3D grids.
[ECCV 2024 - Oral] ACE0 is a learning-based structure-from-motion approach that estimates camera parameters of sets of images by learning a multi-view consistent, implicit scene representation.