Stars
Official repo of "The DAWN of World-Action Interactive Models"
VLA-JEPA: Enhancing Vision-Language-Action Model with Latent World Model
Vega: Learning to Drive with Natural Language Instructions
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
[NeurIPS 2025 spotlight] Official implementation for "FutureSightDrive: Thinking Visually with Spatio-Temporal CoT for Autonomous Driving"
Re-implementation of pi0 vision-language-action (VLA) model from Physical Intelligence
Vision–Language–Action models for Autonomous Driving (VLA4AD) resources, serving as the companion repository to the survey paper “A Survey on Vision–Language–Action Models for Autonomous Driving”.
🌐 3D and 4D World Modeling: A Survey
Official implementation of the paper "Unified World Models: Memory-Augmented Planning and Foresight for Visual Navigation"
DriveVLA-W0: World Models Amplify Data Scaling Law in Autonomous Driving (ICLR 2026)
Official implementation of "From Forecasting to Planning: Policy World Model for Collaborative State-Action Prediction"
Ongoing research training transformer models at scale
[NeurIPS 2025] 𝒳-Scene: Large-Scale Driving Scene Generation with High Fidelity and Flexible Controllability
A curated collection of papers on E2E-AD, aimed at researchers, engineers, and enthusiasts in the field of autonomous driving systems. This repository provides a comprehensive selection of papers, …
verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Framework for novel view synthesis of LiDAR and Camera data
[CVPR 2026] VLM-3R: Vision-Language Models Augmented with Instruction-Aligned 3D Reconstruction
Qwen3-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.
Cosmos-Drive-Dreams: Scalable Synthetic Driving Data Generation with World Foundation Models
[ECCV 2022] ST-P3, an end-to-end vision-based autonomous driving framework via spatial-temporal feature learning.
[CVPR 2025] Gaussian World Model for Streaming 3D Occupancy Prediction
[ICCV 2025] Official implementation of the paper “MagicDrive-V2: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control”