- Huazhong University of Science & Technology
- Wuhan, Hubei Province, China
- https://orcid.org/0009-0009-4752-6118
- @THELMDOFZHOUXIN
- https://lmd0311.github.io/
Stars
Code to load DreamZero model checkpoints and run evaluation on DROID-sim and Genie Sim 3.0
Causal video-action world model for generalist robot control
Gen3R: 3D Scene Generation Meets Feed-Forward Reconstruction
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos
InternVLA-A1: Unifying Understanding, Generation, and Action for Robotic Manipulation
Dream-VL and Dream-VLA, a diffusion VLM and a diffusion VLA.
[AAAI 2026] WorldRFT: Latent World Model Planning with Reinforcement Fine-Tuning for Autonomous Driving
Official code of Motus: A Unified Latent Action World Model
Towards Scalable Pre-training of Visual Tokenizers for Generation
[ICLR 2026] The official implementation of "Soft-Prompted Transformer as Scalable Cross-Embodiment Vision-Language-Action Model"
HY-World 1.5: A Systematic Framework for Interactive World Modeling with Real-Time Latency and Geometric Consistency
Native and Compact Structured Latents for 3D Generation
[Tutorial] Few-Step Distillation for Text-to-Image Generation: A Practical Guide
Official code of “MindDrive: A Vision-Language-Action Model for Autonomous Driving via Online Reinforcement Learning”
Official Implementation of Particulate: Feed-Forward 3D Object Articulation
Official PyTorch Implementation of "SVG-T2I: Scaling up Text-to-Image Latent Diffusion Model Without Variational Autoencoder".
The official implementation of the paper "Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation"
Official implementation for What matters for Representation Alignment: Global Information or Spatial Structure?
Repository of the survey: Progressive Robustness-Aware World Models in Autonomous Driving: A Review and Outlook
A V2V framework that translates human interaction videos into robot manipulation videos.
RynnVLA-002: A Unified Vision-Language-Action and World Model
[ICLR 2026] Astra: General Interactive World Model with Autoregressive Denoising
🌐 WorldLens: Full-Spectrum Evaluations of Driving World Models in Real World
[NeurIPS 2025] "DynamicVerse: A Physically-Aware Multimodal Framework for 4D World Modeling"