View LMD0311's full-sized avatar
🎯
Focusing

Highlights

  • Pro

Organizations

@H-EmbodVis

370 results for source starred repositories

Code to load DreamZero model checkpoints and run evaluation on DROID-sim and Genie Sim 3.0

Python 461 10 Updated Feb 5, 2026

Causal video-action world model for generalist robot control

Python 537 21 Updated Feb 6, 2026

Advancing Open-source World Models

Python 2,610 208 Updated Feb 2, 2026

Gen3R: 3D Scene Generation Meets Feed-Forward Reconstruction

Python 172 3 Updated Jan 14, 2026

NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos

342 7 Updated Jan 5, 2026

InternVLA-A1: Unifying Understanding, Generation, and Action for Robotic Manipulation

Jupyter Notebook 325 20 Updated Feb 3, 2026

Dream-VL and Dream-VLA, a diffusion VLM and a diffusion VLA.

Python 101 4 Updated Jan 14, 2026

[AAAI 2026] WorldRFT: Latent World Model Planning with Reinforcement Fine-Tuning for Autonomous Driving

29 Updated Dec 23, 2025

Official code of Motus: A Unified Latent Action World Model

Python 683 20 Updated Jan 5, 2026

MemEvolve & EvolveLab

Python 159 21 Updated Dec 23, 2025

Code for the Molmo2 Vision-Language Model

153 4 Updated Dec 16, 2025

Towards Scalable Pre-training of Visual Tokenizers for Generation

Python 439 10 Updated Dec 16, 2025

Visual Geometry Transformer for Autonomous Driving

Python 181 8 Updated Dec 19, 2025

[ICLR 2026] The official implementation of "Soft-Prompted Transformer as Scalable Cross-Embodiment Vision-Language-Action Model"

C++ 498 36 Updated Feb 2, 2026

[CVPR 2025] Prompt Depth Anything

Python 1,049 63 Updated Jan 29, 2026

HY-World 1.5: A Systematic Framework for Interactive World Modeling with Real-Time Latency and Geometric Consistency

Python 1,109 90 Updated Jan 13, 2026

Native and Compact Structured Latents for 3D Generation

Python 3,468 328 Updated Jan 10, 2026

[Tutorial] Few-Step Distillation for Text-to-Image Generation: A Practical Guide

Python 338 22 Updated Dec 31, 2025

Official code of “MindDrive: A Vision-Language-Action Model for Autonomous Driving via Online Reinforcement Learning”

118 5 Updated Jan 31, 2026

Official Implementation of Particulate: Feed-Forward 3D Object Articulation

Python 107 6 Updated Jan 25, 2026

Official PyTorch Implementation of "SVG-T2I: Scaling up Text-to-Image Latent Diffusion Model Without Variational Autoencoder".

Python 130 7 Updated Dec 18, 2025

The official implementation of the paper "Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation"

Python 95 1 Updated Dec 28, 2025

Official implementation for What matters for Representation Alignment: Global Information or Spatial Structure?

Python 216 9 Updated Dec 15, 2025

Repository of the survey: Progressive Robustness-Aware World Models in Autonomous Driving: A Review and Outlook

15 1 Updated Dec 15, 2025

Code release for https://wonderzoom.github.io/

152 4 Updated Dec 11, 2025

A V2V framework that translates human interaction videos into robot manipulation videos.

22 1 Updated Dec 12, 2025

RynnVLA-002: A Unified Vision-Language-Action and World Model

Python 875 49 Updated Dec 2, 2025

[ICLR 2026] Astra: General Interactive World Model with Autoregressive Denoising

Python 208 5 Updated Feb 2, 2026

🌐 WorldLens: Full-Spectrum Evaluations of Driving World Models in Real World

Python 178 14 Updated Jan 18, 2026

[NeurIPS 2025] "DynamicVerse: A Physically-Aware Multimodal Framework for 4D World Modeling"

Python 93 3 Updated Dec 21, 2025