AED: Automatic Discovery of Effective and Diverse Vulnerabilities for Autonomous Driving Policy with Large Language Models
[Paper](https://arxiv.org/pdf/2503.20804)
AED enables automatic, effective, and diverse vulnerability discovery in autonomous driving policies. By introducing LLM-based automatic reward design and preference-based reinforcement learning, AED achieves over a 2x improvement in vulnerability diversity and a 10%-70% increase in the effective vulnerability rate across various traffic environments and tested policies.
Feel free to star the repo or cite the paper if you find it interesting.
@article{qiu2025aed,
  title={AED: Automatic Discovery of Effective and Diverse Vulnerabilities for Autonomous Driving Policy with Large Language Models},
  author={Qiu, Le and Xu, Zelai and Tan, Qixin and Tang, Wenhao and Yu, Chao and Wang, Yu},
  journal={arXiv preprint arXiv:2503.20804},
  year={2025}
}

The vulnerability discovery framework used in AED is built upon the previous work VDARS, a multi-agent reinforcement learning framework that discovers vulnerabilities attributable to autonomous driving policies.
Tested on CUDA == 10.1
conda create -n marl
conda activate marl
pip install torch==1.5.1+cu101 torchvision==0.6.1+cu101 -f https://download.pytorch.org/whl/torch_stable.html
cd onpolicy
pip install -e .
pip install wandb icecream setproctitle gym seaborn tensorboardX slackweb psutil pyastar2d einops
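After installation, a quick sanity check (a minimal snippet of ours, not part of the repository) confirms that the CUDA build of PyTorch is active:

```python
# Quick sanity check that the CUDA 10.1 build of PyTorch is installed correctly.
import torch

print(torch.__version__)          # expected: 1.5.1+cu101
print(torch.cuda.is_available())  # should print True on a machine with CUDA 10.1
```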
- config.py: contains all hyper-parameters.
  - default: use GPU, chunk-version recurrent policy, and shared policy.
  - other important hyperparameters (a sketch of how these flags are declared follows after this list):
    - use_centralized_V: use a centralized value function (MAPPO) or an independent one (IPPO).
    - use_single_network: whether the actor and critic share a single network base.
    - use_recurrent_policy: use an RNN policy or an MLP policy.
    - use_eval: turn on evaluation while training; if True, you need to set "n_eval_rollout_threads".
    - wandb_name: for example, if your wandb link is https://wandb.ai/mapping, then you need to change wandb_name to "mapping".
    - user_name: only controls the program name shown in "nvidia-smi".
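For orientation, the sketch below shows how boolean switches like these are typically declared with argparse in a config.py. The flag names come from the list above, but the defaults and help strings are our assumptions rather than the repository's exact code.

```python
# Sketch of how the switches above are typically exposed in config.py (argparse).
# Flag names follow the list above; defaults and help texts are assumptions.
import argparse

parser = argparse.ArgumentParser(description="AED training hyper-parameters (excerpt)")
parser.add_argument("--use_centralized_V", action="store_true",
                    help="centralized value function (MAPPO) vs. independent (IPPO)")
parser.add_argument("--use_single_network", action="store_true",
                    help="share one network base between actor and critic")
parser.add_argument("--use_recurrent_policy", action="store_true",
                    help="use an RNN policy instead of an MLP policy")
parser.add_argument("--use_eval", action="store_true",
                    help="evaluate while training; requires --n_eval_rollout_threads")
parser.add_argument("--n_eval_rollout_threads", type=int, default=1)
parser.add_argument("--wandb_name", type=str, default="mapping")
parser.add_argument("--user_name", type=str, default="user")
args = parser.parse_args()
```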
- Highway
  - training script: ./train_highway.sh
  - rendering script: ./render_highway.sh
- Roundabout
  - training script: ./train_roundabout.sh
  - rendering script: ./render_roundabout.sh
The automatic reward design framework for different vulnerability types in AED is based on Eureka.
Eureka requires Python ≥ 3.8. We have tested on Ubuntu 20.04 and 22.04.
- Eureka currently uses the OpenAI API for language model queries. You need an OpenAI API key to use Eureka here. Then, set the environment variable in your terminal:
pip install openai
export OPENAI_API_KEY="YOUR_API_KEY"
Navigate to the onpolicy/scripts/train directory and run:
python train_eureka_human.py env={environment} collision={collision_type} iteration={num_iterations} sample={num_samples}
- `{environment}` is the task to perform. Options are `highway` and `roundabout`.
- `{collision_type}` is the type of collision you want to generate. Options are listed in `onpolicy/scripts/train/cfg/collision`.
- `{num_samples}` is the number of reward samples to generate per iteration. Default value is 6.
- `{num_iterations}` is the number of Eureka iterations to run. Default value is 5.
Each run will create a timestamp folder in onpolicy/scripts/train/outputs that saves the Eureka log as well as all intermediate reward functions and associated policies.
Other command line parameters can be found in onpolicy/scripts/train/cfg/config.yaml.
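Conceptually, each Eureka iteration samples several candidate reward functions from the LLM, trains a policy on each, and feeds the best result back into the next prompt. The sketch below is a simplified, hypothetical illustration of that loop with the OpenAI Python client; the prompt, model name, and `evaluate_candidate` stub are our assumptions, not the contents of `train_eureka_human.py`.

```python
# Hypothetical, simplified Eureka-style reward-design loop, for illustration only.
# The prompt, model name, and evaluate_candidate stub are assumptions; see
# train_eureka_human.py for the actual implementation.
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
num_iterations, num_samples = 5, 6  # iteration={num_iterations}, sample={num_samples}

def evaluate_candidate(reward_code: str) -> float:
    """Placeholder: train a policy with this reward and score the vulnerabilities it finds."""
    return 0.0

feedback = "No previous attempt."
for it in range(num_iterations):
    candidates = []
    for _ in range(num_samples):
        response = client.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system",
                 "content": "Write a Python reward function for a highway driving "
                            "environment that induces the requested collision type."},
                {"role": "user", "content": feedback},
            ],
        )
        candidates.append(response.choices[0].message.content)
    # Keep the best candidate of this iteration and feed it back as context.
    scores = [evaluate_candidate(code) for code in candidates]
    best = max(range(num_samples), key=lambda i: scores[i])
    feedback = f"Best reward so far (score {scores[best]:.2f}):\n{candidates[best]}"
```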
- Collect trajectory data

  After training the RL agent with reward functions selected by Eureka, navigate to `onpolicy/scripts` and run `./render_highway.sh`. This will generate trajectory data in text format, stored in `vulnerability/Eureka`.
- Collect preference data

  Navigate to the `Preprocess_Data` directory and run `python data_collect.py`. Make sure to configure the paths in the script: `data_sample_1_name` is the path to the positive data; `data_sample_2_name` and `data_sample_3_name` are the paths to the negative data.
- Run `python train_preference.py` to train the preference reward model (a conceptual sketch of the preference loss follows after this list).
- Run `python test_preference.py` to visualize the reward distribution.
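For intuition about what `train_preference.py` fits: preference-based reward learning typically trains a reward network with a Bradley-Terry (pairwise logistic) loss so that preferred (positive) trajectories receive a higher predicted return than negative ones. The PyTorch sketch below illustrates that idea; the architecture, state dimension, and batch shapes are assumptions, not the repository's code.

```python
# Minimal Bradley-Terry preference reward model sketch (PyTorch).
# Architecture, state dimension, and training details are assumptions that
# illustrate the idea behind train_preference.py, not its exact code.
import torch
import torch.nn as nn

STATE_DIM = 25  # assumed flattened observation size

class RewardModel(nn.Module):
    def __init__(self, state_dim=STATE_DIM, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, states):              # states: (batch, T, state_dim)
        return self.net(states).sum(dim=1)  # predicted trajectory return: (batch, 1)

model = RewardModel()
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)

def preference_loss(pos_states, neg_states):
    """Bradley-Terry loss: push the predicted return of preferred trajectories higher."""
    r_pos, r_neg = model(pos_states), model(neg_states)
    return -torch.log(torch.sigmoid(r_pos - r_neg)).mean()

# Dummy batch of 8 trajectory pairs, each 40 steps long, for illustration.
pos = torch.randn(8, 40, STATE_DIM)
neg = torch.randn(8, 40, STATE_DIM)
loss = preference_loss(pos, neg)
loss.backward()
optimizer.step()
```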
- Navigate to `onpolicy/envs/highway/Highway_Env.py`.
- Set the `reward_path` to your trained preference reward model.
- Set the `compute_reward()` function as the LLM-driven reward model (a rough sketch follows after this list).
- Navigate to `onpolicy/scripts` and run `./train_highway.sh` to start RL training.
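A rough sketch of what the `reward_path` and `compute_reward()` steps amount to inside the environment: load the trained preference reward model once, then let `compute_reward()` score states with it instead of a hand-crafted reward. The class name, loading format, and tensor shapes below are assumptions; only `reward_path` and `compute_reward()` come from the steps above.

```python
# Hypothetical sketch of wiring the trained preference reward model into the
# environment; only reward_path and compute_reward() come from the steps above,
# everything else (class name, shapes, loading format) is an assumption.
import torch

class HighwayEnvWithLearnedReward:
    def __init__(self, reward_path="path/to/preference_reward_model.pt"):
        # Load the reward model produced by train_preference.py.
        self.reward_model = torch.load(reward_path, map_location="cpu")
        self.reward_model.eval()

    def compute_reward(self, state):
        """Score the current state with the learned preference reward model."""
        with torch.no_grad():
            state_tensor = torch.as_tensor(state, dtype=torch.float32).unsqueeze(0)
            return float(self.reward_model(state_tensor).squeeze())
```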