lh-zhu
🀄
“Still fooling around? Red Dragon boss.”
  • Huazhong University of Science and Technology

Organizations

@hustvl @baaivision

Starred repositories

Python 7 Updated Mar 23, 2026

Spa3R: Predictive Spatial Field Modeling for 3D Visual Reasoning

Python 50 1 Updated Mar 25, 2026

A hardware-aware, efficient implementation of "Mixture-of-Depths Attention".

Python 270 9 Updated May 6, 2026

[ECCV 2026] DiffusionVL: Translating Any Autoregressive Models into Diffusion Vision Language Models

Python 148 9 Updated Dec 25, 2025

Visual Generation Tuning

Python 101 1 Updated Apr 16, 2026

[ArXiv 2025] MobileI2V: Fast and High-Resolution Image-to-Video on Mobile Devices

Python 86 5 Updated May 20, 2026

Native Multimodal Models are World Learners

Python 1,526 67 Updated Dec 30, 2025

[ICLR 2025] ControlAR: Controllable Image Generation with Autoregressive Models

Python 326 10 Updated Apr 24, 2025

[NeurIPS 2025] Pixel-Perfect Depth

Python 1,047 38 Updated Feb 13, 2026

RoMeO: Robust Metric Visual Odometry

29 1 Updated Dec 16, 2024

[ICCV 2025] ZeroStereo: Zero-Shot Stereo Matching from Single Images

Python 60 8 Updated May 21, 2026

MonSter++: A Unified Geometric Foundation Model for Stereo and Multi-View Depth Estimation via the Unleashing of Monodepth Priors

Python 266 20 Updated Dec 23, 2025

[CVPR 2024] Adaptive Fusion of Single-View and Multi-View Depth for Autonomous Driving

Python 166 12 Updated Mar 25, 2024

[TPAMI 2024 & CVPR 2022] Attention Concatenation Volume for Accurate and Efficient Stereo Matching

Python 484 57 Updated Apr 18, 2025

[CVPR 2025 Highlight] MonSter: Marry Monodepth to Stereo Unleashes Power

Python 697 51 Updated Dec 2, 2025

A DIP project with a GUI

C++ 6 1 Updated Mar 31, 2023

The repository of "Snap-Snap: Taking Two Images to Reconstruct 3D Human Gaussians in Milliseconds"

Python 40 1 Updated Sep 1, 2025

[AAAI 2026 Oral] LENS: Learning to Segment Anything with Unified Reinforced Reasoning

Python 133 9 Updated Dec 3, 2025

[ICLR 2026] ReCogDrive: A Reinforced Cognitive Framework for End-to-End Autonomous Driving

Python 548 69 Updated Jun 20, 2026

Official implementation of T-PAMI25 paper "M²Diffuser: Diffusion-based Trajectory Optimization for Mobile Manipulation in 3D Scenes"

Python 122 7 Updated Jun 17, 2025

PixelHacker: Image Inpainting with Structural and Semantic Consistency

Python 568 66 Updated Jun 20, 2026

Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.

Jupyter Notebook 1,580 65 Updated Jun 14, 2025

BibTeX Style 1,543 377 Updated Mar 19, 2026

[ICCV 2025] GroundingSuite: Measuring Complex Multi-Granular Pixel Grounding

Python 77 1 Updated Jun 26, 2025

The first decoder-only multimodal state space model

Python 104 6 Updated May 19, 2025

[CVPR'25 Highlight] You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale

Python 719 19 Updated Apr 16, 2025

[CVPR 2025 Oral] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models

Python 1,497 57 Updated Dec 16, 2025

[CVPR 2025] Prompt Depth Anything

Python 1,124 71 Updated Jan 29, 2026

[CVPR 2025] GaussTR: Foundation Model-Aligned Gaussian Transformer for Self-Supervised 3D Spatial Understanding

Python 215 13 Updated Jan 5, 2026

[CVPR 2025 Highlight] Truncated Diffusion Model for Real-Time End-to-End Autonomous Driving

Python 1,428 136 Updated Dec 8, 2025