Stars
Code repository for Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model.
[ECCV 2024 Oral] LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation.
[CVPR'25 Oral] MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision
[ECCV 2022] XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model
SplaTAM: Splat, Track & Map 3D Gaussians for Dense RGB-D SLAM (CVPR 2024)
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
High Resolution Depth Maps for Stable Diffusion WebUI
[ICCV 2025 Highlight] OminiControl: Minimal and Universal Control for Diffusion Transformer
[NeurIPS 2023] MotionGPT: Human Motion as a Foreign Language, a unified motion-language generation model using LLMs
[CVPR'22] ICON: Implicit Clothed humans Obtained from Normals
An implementation of 3D Ken Burns Effect from a Single Image using PyTorch
Official code of "HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation", CVPR 2021
ViPE: Video Pose Engine for Geometric 3D Perception
Official code for the paper "LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes".
Official Code for MotionCtrl [SIGGRAPH 2024]
[TPAMI 2025] ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis
Vid2Avatar: 3D Avatar Reconstruction from Videos in the Wild via Self-supervised Scene Decomposition (CVPR 2023)
[TPAMI'23] Unifying Flow, Stereo and Depth Estimation
The implementation of "Prismer: A Vision-Language Model with Multi-Task Experts".
[CVPR 2023] BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects
[CVPR 2024 Highlight] PhysGaussian: Physics-Integrated 3D Gaussians for Generative Dynamics
[CVPR'23, Highlight] ECON: Explicit Clothed humans Optimized via Normal integration
Official implementation of Continuous 3D Perception Model with Persistent State
Code for 3D-LLM: Injecting the 3D World into Large Language Models
A simulation platform for versatile Embodied AI research and development.
[NeurIPS 2024] An official implementation of "ShareGPT4Video: Improving Video Understanding and Generation with Better Captions"
Text2Room generates textured 3D meshes from a given text prompt using 2D text-to-image models (ICCV 2023).
Universal Monocular Metric Depth Estimation