Stars
BEVFormer inference on TensorRT, including INT8 Quantization and Custom TensorRT Plugins (float/half/half2/int8).
[CVPR 2023 Best Paper Award] Planning-oriented Autonomous Driving
Driving in the Occupancy World: Vision-Centric 4D Occupancy Forecasting and Planning via World Models for Autonomous Driving (AAAI-25)
This is the official implementation of UniOcc: A Unified Benchmark for Occupancy Forecasting and Prediction in Autonomous Driving
[ECCV 2024] Fully Sparse 3D Occupancy Prediction & RayIoU Evaluation Metric
DiffusionDriveV2: Reinforcement Learning-Constrained Truncated Diffusion Modeling in End-to-End Autonomous Driving
[ECCV 2022] This is the official implementation of BEVFormer, a camera-only framework for autonomous driving perception, e.g., 3D object detection and semantic map segmentation.
[NeurIPS 2025] RAD: Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning
[ICCV 2023 & ICLR 2026] VAD: Vectorized Scene Representation for Efficient Autonomous Driving
[ICLR 2025] DriveTransformer: Unified Transformer for Scalable End-to-End Autonomous Driving
OpenEMMA, a permissively licensed open-source "reproduction" of Waymo’s EMMA model.
A collection of papers on world models for autonomous driving (and robotics, etc.).
[ICCV 2025] DiST-4D: Disentangled Spatiotemporal Diffusion with Metric Depth for 4D Driving Scene Generation
Official Code for Epona: Autoregressive Diffusion World Model for Autonomous Driving (ICCV 2025)
[CVPR 2025 Highlight] Truncated Diffusion Model for Real-Time End-to-End Autonomous Driving
Open-source simulator for autonomous driving research.
[ICLR 2025 Oral] The official implementation of "Diffusion-Based Planning for Autonomous Driving with Flexible Guidance"
[ECCV 2024] A unified framework for scalable vehicle trajectory prediction
[ICCV 2025] Official implementation of the paper "VACE: All-in-One Video Creation and Editing"
A curated list of awesome autoregressive papers in Generative AI
Fast and memory-efficient exact attention
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
Step1X-3D: Towards High-Fidelity and Controllable Generation of Textured 3D Assets