Stars
DiffusionDriveV2: Reinforcement Learning-Constrained Truncated Diffusion Modeling in End-to-End Autonomous Driving
[ECCV 2022] This is the official implementation of BEVFormer, a camera-only framework for autonomous driving perception, e.g., 3D object detection and semantic map segmentation.
[NeurIPS 2025] RAD: Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning
[ICCV 2023] VAD: Vectorized Scene Representation for Efficient Autonomous Driving
[ICLR 2025] DriveTransformer: Unified Transformer for Scalable End-to-End Autonomous Driving
OpenEMMA, a permissively licensed open source "reproduction" of Waymo’s EMMA model.
A collection of papers on World Models for Autonomous Driving (and Robotics, etc.).
[ICCV 2025] DiST-4D: Disentangled Spatiotemporal Diffusion with Metric Depth for 4D Driving Scene Generation
Official Code for Epona: Autoregressive Diffusion World Model for Autonomous Driving (ICCV 2025)
[CVPR 2025 Highlight] Truncated Diffusion Model for Real-Time End-to-End Autonomous Driving
Open-source simulator for autonomous driving research.
[ICLR 2025 Oral] The official implementation of "Diffusion-Based Planning for Autonomous Driving with Flexible Guidance"
A Unified Framework for scalable Vehicle Trajectory Prediction, ECCV 2024
[ICCV 2025] Official implementation of the paper "VACE: All-in-One Video Creation and Editing"
A curated list of awesome autoregressive papers in Generative AI
Fast and memory-efficient exact attention
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
Step1X-3D: Towards High-Fidelity and Controllable Generation of Textured 3D Assets
[ICCV 2025] Official code of DeepMesh: Auto-Regressive Artist-mesh Creation with Reinforcement Learning
OctGPT: Octree-based Multiscale Autoregressive Models for 3D Shape Generation [SIGGRAPH 2025]
TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models
HoloPart: Generative 3D Part Amodal Segmentation
Code and models for ICML 2024 paper, NExT-GPT: Any-to-Any Multimodal Large Language Model
A Modular Framework for 3D Generation and Beyond [WIP]