- Huazhong University of Science and Technology
- Wuhan, China
- https://scholar.google.com.hk/citations?user=jEjHhDUAAAAJ&hl=zh-CN
Stars
[SIGGRAPH2026] Official code for SIGGRAPH2026 paper: R-DMesh: Video-Guided 3D Animation via Rectified Dynamic Mesh Flow
[NeurIPS 2025] ZPressor: Bottleneck-Aware Compression for Scalable Feed-Forward 3DGS
[NeurIPS 2025 (Spotlight)] The implementation for the paper "4DGT: Learning a 4D Gaussian Transformer Using Real-World Monocular Videos"
[ICLR 2026] StreamSplat: Towards Online Dynamic 3D Reconstruction from Uncalibrated Video Streams
Official Implementation of CoInteract: Spatially-Structured Co-Generation for Interactive Human-Object Video Synthesis
[ICML 2026] World-R1: Reinforcing 3D Constraints for Text-to-Video Generation
[ICLR'25 Oral] No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images
[ECCV 2022] An End-to-End Transformer Model for Crowd Localization
Official code, models, and data for Vista4D: Video Reshooting with 4D Point Clouds (CVPR 2026 Highlight)
[NeurIPS 2025] VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models
[CVPR'25 Oral] MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision
[ICLR 2026] π^3: Permutation-Equivariant Visual Geometry Learning
A Curated List of Awesome Video World Models with AR Diffusion: Covering Algorithms, Applications, and Infrastructure, Aimed at Serving as a Comprehensive Resource for Researchers, Practitioners, a…
AnyRecon: Arbitrary-View 3D Reconstruction with Video Diffusion Model
UnrealCV: Connecting Computer Vision to Unreal Engine
UnrealZoo / unrealzoo-gym
Forked from zfw1226/gym-unrealcv
[ICCV 2025 Highlights] Large-scale photo-realistic virtual worlds for embodied AI
Official implementation of Spatial-Forcing: Implicit Spatial Representation Alignment for Vision-language-action Model [ICLR2026]
Official codebase for Fast-WAM: Do World Action Models Need Test-time Future Imagination?
AI agents running research on single-GPU nanochat training automatically
[CVPR '26] SceneTok: A Compressed, Diffusable Token Space for 3D Scenes
UniDriveVLA: Unifying Understanding, Perception, and Action Planning for Autonomous Driving
TartanAir dataset tools and samples
A Python package for the TartanAir-V2 dataset.
Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding
[ICCV 25] Seeing and Seeing Through the Glass: Real and Synthetic Data for Multi-Layer Depth Estimation
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
Open-source video dataset with dynamic scenes and camera movement annotations
A toolkit for computing Fréchet Inception Distance (FID) & Fréchet Video Distance (FVD) metrics.