Stars
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
This is the official implementation of our paper, UniHOPE: A Unified Approach for Hand-Only and Hand-Object Pose Estimation
SPEAR: A Simulator for Photorealistic Embodied AI Research
L3ROcc is a high-performance visual geometry framework designed to transform standard RGB video sequences into high-precision 3D Point Clouds, 3D Occupancy Grids, and 4D Temporal Observation Data.
[ICCV'25 Best Paper Finalist] ReCamMaster: Camera-Controlled Generative Rendering from A Single Video
Pointcept: Perceive the world with sparse points, a codebase for point cloud perception research. Latest works: Utonia (ICML'26), Concerto (NeurIPS'25), Sonata (CVPR'25 Highlight), PTv3 (CVPR'24 Oral)
We provide a way to fuse MANO parameters into SMPLX.
Porting the MANO hand model to the PyBullet simulator
Code for the paper "Reconstructing Objects along Hand Interaction Timelines in Egocentric Video"
[CVPR'24 Oral] Official repository of Point Transformer V3 (PTv3)
A Blender add-on for importing a sequence of OBJ meshes as frames
This repository collects papers on Human-Interaction-Motion-Generation applications. New papers are added irregularly.
ViPE: Video Pose Engine for Geometric 3D Perception
🎤 Register Any Point: Scaling 3D Point Cloud Registration by Flow Matching [ECCV 26]
Anny, A Free and Interpretable Human Body Model for all ages, written in PyTorch.
A markup-based typesetting system that is powerful and easy to learn.
[CVPR'24 Best Student Paper] Mip-Splatting: Alias-free 3D Gaussian Splatting
[CVPR 2025 Highlight] Video Depth Anything: Consistent Depth Estimation for Super-Long Videos
[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
[CVPR 2025] Open-World Amodal Appearance Completion
Official implementation of ICCV 2025 paper "TACO: Taming Diffusion for in-the-wild Video Amodal Completion"
[CVPR 2025] Official code for Using Diffusion Priors for Video Amodal Segmentation
CoTracker is a model for tracking any point (pixel) on a video.
PhysX-Anything: Simulation-Ready Physical 3D Assets from Single Image (CVPR 2026)
[CVPR 2024 Highlight] FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects