AED: Automatic Discovery of Effective and Diverse Vulnerabilities for Autonomous Driving Policy with Large Language Models
[Paper](https://arxiv.org/pdf/2503.20804)
AED enables automatic, effective, and diverse vulnerability discovery in autonomous driving policies. By introducing LLM-based automatic reward design and preference-based reinforcement learning, AED achieves over a 2x improvement in vulnerability diversity and a 10%-70% increase in the effective vulnerability rate across various traffic environments and tested policies.
Feel free to star the repo or cite the paper if you find it interesting.
@article{qiu2025aed,
  title={AED: Automatic Discovery of Effective and Diverse Vulnerabilities for Autonomous Driving Policy with Large Language Models},
  author={Qiu, Le and Xu, Zelai and Tan, Qixin and Tang, Wenhao and Yu, Chao and Wang, Yu},
  journal={arXiv preprint arXiv:2503.20804},
  year={2025}
}

The vulnerability discovery framework used in AED is built upon the previous work VDARS, a multi-agent reinforcement learning framework that discovers vulnerabilities attributable to autonomous driving policies.
Tested on CUDA == 10.1
conda create -n marl
conda activate marl
pip install torch==1.5.1+cu101 torchvision==0.6.1+cu101 -f https://download.pytorch.org/whl/torch_stable.html
cd onpolicy
pip install -e .
pip install wandb icecream setproctitle gym seaborn tensorboardX slackweb psutil pyastar2d einops
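After installation, a quick sanity check (a minimal snippet of ours, not part of the repository) confirms that the CUDA build of PyTorch is active:

```python
# Quick sanity check that the CUDA 10.1 build of PyTorch is installed correctly.
import torch

print(torch.__version__)          # expected: 1.5.1+cu101
print(torch.cuda.is_available())  # should print True on a machine with CUDA 10.1
```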
- config.py: contains all hyper-parameters.
  - default: use GPU, chunk-version recurrent policy, and shared policy.
  - other important hyperparameters (a sketch of how these flags are declared follows after this list):
    - use_centralized_V: use a centralized value function (MAPPO) or an independent one (IPPO).
    - use_single_network: whether the actor and critic share a single network base.
    - use_recurrent_policy: use an RNN policy or an MLP policy.
    - use_eval: turn on evaluation while training; if True, you need to set "n_eval_rollout_threads".
    - wandb_name: for example, if your wandb link is https://wandb.ai/mapping, then you need to change wandb_name to "mapping".
    - user_name: only controls the program name shown in "nvidia-smi".
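For orientation, the sketch below shows how boolean switches like these are typically declared with argparse in a config.py. The flag names come from the list above, but the defaults and help strings are our assumptions rather than the repository's exact code.

```python
# Sketch of how the switches above are typically exposed in config.py (argparse).
# Flag names follow the list above; defaults and help texts are assumptions.
import argparse

parser = argparse.ArgumentParser(description="AED training hyper-parameters (excerpt)")
parser.add_argument("--use_centralized_V", action="store_true",
                    help="centralized value function (MAPPO) vs. independent (IPPO)")
parser.add_argument("--use_single_network", action="store_true",
                    help="share one network base between actor and critic")
parser.add_argument("--use_recurrent_policy", action="store_true",
                    help="use an RNN policy instead of an MLP policy")
parser.add_argument("--use_eval", action="store_true",
                    help="evaluate while training; requires --n_eval_rollout_threads")
parser.add_argument("--n_eval_rollout_threads", type=int, default=1)
parser.add_argument("--wandb_name", type=str, default="mapping")
parser.add_argument("--user_name", type=str, default="user")
args = parser.parse_args()
```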
- Highway
  - training script: ./train_highway.sh
  - rendering script: ./render_highway.sh
- Roundabout
  - training script: ./train_roundabout.sh
  - rendering script: ./render_roundabout.sh
The automatic reward design framework for different vulnerability types in AED is based on Eureka.
Eureka requires Python ≥ 3.8. We have tested on Ubuntu 20.04 and 22.04.
- Eureka currently uses the OpenAI API for language model queries. You need an OpenAI API key to use Eureka here. Then, set the environment variable in your terminal:
pip install openai
export OPENAI_API_KEY="YOUR_API_KEY"
Navigate to the onpolicy/scripts/train directory and run:
python train_eureka_human.py env={environment} collision={collision_type} iteration={num_iterations} sample={num_samples}
- `{environment}` is the task to perform. Options are `highway` and `roundabout`.
- `{collision_type}` is the type of collision you want to generate. Options are listed in `onpolicy/scripts/train/cfg/collision`.
- `{num_samples}` is the number of reward samples to generate per iteration. Default value is 6.
- `{num_iterations}` is the number of Eureka iterations to run. Default value is 5.
Each run will create a timestamp folder in onpolicy/scripts/train/outputs that saves the Eureka log as well as all intermediate reward functions and associated policies.
Other command line parameters can be found in onpolicy/scripts/train/cfg/config.yaml.
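Conceptually, each Eureka iteration samples several candidate reward functions from the LLM, trains a policy on each, and feeds the best result back into the next prompt. The sketch below is a simplified, hypothetical illustration of that loop with the OpenAI Python client; the prompt, model name, and `evaluate_candidate` stub are our assumptions, not the contents of `train_eureka_human.py`.

```python
# Hypothetical, simplified Eureka-style reward-design loop, for illustration only.
# The prompt, model name, and evaluate_candidate stub are assumptions; see
# train_eureka_human.py for the actual implementation.
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
num_iterations, num_samples = 5, 6  # iteration={num_iterations}, sample={num_samples}

def evaluate_candidate(reward_code: str) -> float:
    """Placeholder: train a policy with this reward and score the vulnerabilities it finds."""
    return 0.0

feedback = "No previous attempt."
for it in range(num_iterations):
    candidates = []
    for _ in range(num_samples):
        response = client.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system",
                 "content": "Write a Python reward function for a highway driving "
                            "environment that induces the requested collision type."},
                {"role": "user", "content": feedback},
            ],
        )
        candidates.append(response.choices[0].message.content)
    # Keep the best candidate of this iteration and feed it back as context.
    scores = [evaluate_candidate(code) for code in candidates]
    best = max(range(num_samples), key=lambda i: scores[i])
    feedback = f"Best reward so far (score {scores[best]:.2f}):\n{candidates[best]}"
```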
- Collect trajectory data

  After training the RL agent with reward functions selected by Eureka, navigate to `onpolicy/scripts` and run `./render_highway.sh`. This will generate trajectory data in text format, stored in `vulnerability/Eureka`.
- Collect preference data

  Navigate to the `Preprocess_Data` directory and run `python data_collect.py`. Make sure to configure the paths in the script: `data_sample_1_name` is the path to the positive data; `data_sample_2_name` and `data_sample_3_name` are the paths to the negative data.
- Run `python train_preference.py` to train the preference reward model (a conceptual sketch of the preference loss follows after this list).
- Run `python test_preference.py` to visualize the reward distribution.
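For intuition about what `train_preference.py` fits: preference-based reward learning typically trains a reward network with a Bradley-Terry (pairwise logistic) loss so that preferred (positive) trajectories receive a higher predicted return than negative ones. The PyTorch sketch below illustrates that idea; the architecture, state dimension, and batch shapes are assumptions, not the repository's code.

```python
# Minimal Bradley-Terry preference reward model sketch (PyTorch).
# Architecture, state dimension, and training details are assumptions that
# illustrate the idea behind train_preference.py, not its exact code.
import torch
import torch.nn as nn

STATE_DIM = 25  # assumed flattened observation size

class RewardModel(nn.Module):
    def __init__(self, state_dim=STATE_DIM, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, states):              # states: (batch, T, state_dim)
        return self.net(states).sum(dim=1)  # predicted trajectory return: (batch, 1)

model = RewardModel()
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)

def preference_loss(pos_states, neg_states):
    """Bradley-Terry loss: push the predicted return of preferred trajectories higher."""
    r_pos, r_neg = model(pos_states), model(neg_states)
    return -torch.log(torch.sigmoid(r_pos - r_neg)).mean()

# Dummy batch of 8 trajectory pairs, each 40 steps long, for illustration.
pos = torch.randn(8, 40, STATE_DIM)
neg = torch.randn(8, 40, STATE_DIM)
loss = preference_loss(pos, neg)
loss.backward()
optimizer.step()
```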
- Navigate to `onpolicy/envs/highway/Highway_Env.py`.
- Set the `reward_path` to your trained preference reward model.
- Set the `compute_reward()` function as the LLM-driven reward model (a rough sketch follows after this list).
- Navigate to `onpolicy/scripts` and run `./train_highway.sh` to start RL training.
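A rough sketch of what the `reward_path` and `compute_reward()` steps amount to inside the environment: load the trained preference reward model once, then let `compute_reward()` score states with it instead of a hand-crafted reward. The class name, loading format, and tensor shapes below are assumptions; only `reward_path` and `compute_reward()` come from the steps above.

```python
# Hypothetical sketch of wiring the trained preference reward model into the
# environment; only reward_path and compute_reward() come from the steps above,
# everything else (class name, shapes, loading format) is an assumption.
import torch

class HighwayEnvWithLearnedReward:
    def __init__(self, reward_path="path/to/preference_reward_model.pt"):
        # Load the reward model produced by train_preference.py.
        self.reward_model = torch.load(reward_path, map_location="cpu")
        self.reward_model.eval()

    def compute_reward(self, state):
        """Score the current state with the learned preference reward model."""
        with torch.no_grad():
            state_tensor = torch.as_tensor(state, dtype=torch.float32).unsqueeze(0)
            return float(self.reward_model(state_tensor).squeeze())
```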