- Zhejiang University
- Yuquan Campus, Hangzhou, Zhejiang, China
Stars
A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
Reference PyTorch implementation and models for DINOv3
The official code for PanopticRecon: Leverage Open-vocabulary Instance Segmentation for Zero-shot Panoptic Reconstruction [IROS 2024 Oral]
The evaluation method of the accurate and efficient RGB-colorized reconstruction
[ICCV 2025 Highlight] DIMO: Diverse 3D Motion Generation for Arbitrary Objects
A collection of papers on world models for autonomous driving (and robotics, etc.).
[ICLR2025] A PyTorch implementation for STORM: Spatiotemporal Reconstruction Model for Large-Scale Outdoor Scenes
(ICLR2025) Enhancing End-to-End Autonomous Driving with Latent World Model
[ICCV 2025] GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors
[ICLR 2025 Oral] Official code for "LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias"
[AAAI 2026] OpenDriveVLA: Towards End-to-end Autonomous Driving with Large Vision Language Action Model
This is the official implementation of UniOcc: A Unified Benchmark for Occupancy Forecasting and Prediction in Autonomous Driving
[IEEE T-PAMI 2024] All you need for End-to-end Autonomous Driving
[CVPR 2025] UniScene: Unified Occupancy-centric Driving Scene Generation
[CVPR 2025] Sparse Voxels Rasterization: Real-time High-fidelity Radiance Field Rendering
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
Official code for PanopticRecon++ (PR++)
Official implementation of "DepthLab: From Partial to Complete"
[CVPR 2025] StreetCrafter: Street View Synthesis with Controllable Video Diffusion Models
[CVPR'25 Highlight] You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale
[ICCV 2025] Stag-1: Towards Realistic 4D Driving Simulation with Video Generation Model
[CVPR 2024 Highlight] XCube: Large-Scale 3D Generative Modeling using Sparse Voxel Hierarchies
Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"
FAST-LIVO2: Fast, Direct LiDAR-Inertial-Visual Odometry
[ICCV'25] DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion
🎞️ [NeurIPS'24] MVSplat360: Feed-Forward 360 Scene Synthesis from Sparse Views
A comprehensive list of Implicit Representations and NeRF papers relating to Robotics/RL domain, including papers, codes, and related websites