- Huazhong University of Science & Technology
- Wuhan, Hubei Province, China
- UTC +08:00
- https://orcid.org/0009-0009-4752-6118
- @THELMDOFZHOUXIN
- https://lmd0311.github.io/
Stars
The official implementation of "Soft-Prompted Transformer as Scalable Cross-Embodiment Vision-Language-Action Model"
WorldPlay: Interactive World Modeling with Real-Time Latency and Geometric Consistency
Native and Compact Structured Latents for 3D Generation
[Tutorial] Few-Step Distillation for Text-to-Image Generation: A Practical Guide
Official code of “MindDrive: A Vision-Language-Action Model for Autonomous Driving via Online Reinforcement Learning”
Official Implementation of Particulate: Feed-Forward 3D Object Articulation
Official PyTorch Implementation of "SVG-T2I: Scaling up Text-to-Image Latent Diffusion Model Without Variational Autoencoder".
The official implementation of the paper "Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation"
Official implementation for What matters for Representation Alignment: Global Information or Spatial Structure?
Repository of the survey: Progressive Robustness-Aware World Models in Autonomous Driving: A Review and Outlook
A V2V framework that translates human interaction videos into robot manipulation videos.
RynnVLA-002: A Unified Vision-Language-Action and World Model
The official repository of "Astra: General Interactive World Model with Autoregressive Denoising"
🌐 WorldLens: Full-Spectrum Evaluations of Driving World Models in Real World
[NeurIPS 2025] "DynamicVerse: A Physically-Aware Multimodal Framework for 4D World Modeling"
Running VLA at 30Hz frame rate and 480Hz trajectory frequency
DGGT: Feedforward 4D Reconstruction of Dynamic Driving Scenes using Unposed Images
DiffusionDriveV2: Reinforcement Learning-Constrained Truncated Diffusion Modeling in End-to-End Autonomous Driving
Official implementation of "C3G: Learning Compact 3D Representations with 2K Gaussians"
[arXiv] Fast3Dcache: Training-free 3D Geometry Synthesis Acceleration
Learning to Drive via Real-World Simulation at Scale
G2VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning
HunyuanVideo-1.5: A leading lightweight video generation model
[AAAI 2026 Oral] Cook and Clean Together: Teaching Embodied Agents for Parallel Task Execution