This is the official implementation of "MADiff: Offline Multi-agent Learning with Diffusion Models", published at NeurIPS 2024.
We omit the standard deviations of the results for brevity; the full results can be found in our paper.

Performance on the MPE datasets released in the OMAR paper. Results are averaged over 5 random seeds.
| Dataset | Task | BC | MA-ICQ | MA-TD3+BC | MA-CQL | OMAR | MADiff-D | MADiff-C* |
|---|---|---|---|---|---|---|---|---|
| Expert | Spread | 35.0 | 104.0 | 108.3 | 98.2 | 114.9 | 95.0 | 116.7 |
| Md-Replay | Spread | 10.0 | 13.6 | 15.4 | 20.0 | 37.9 | 30.3 | 42.2 |
| Medium | Spread | 31.6 | 29.3 | 29.3 | 34.1 | 47.9 | 64.9 | 58.2 |
| Random | Spread | -0.5 | 6.3 | 9.8 | 24.0 | 34.4 | 6.9 | 4.3 |
| Expert | Tag | 40.0 | 113.0 | 115.2 | 93.9 | 116.2 | 120.9 | 167.6 |
| Md-Replay | Tag | 0.9 | 34.5 | 28.7 | 24.8 | 47.1 | 62.3 | 95.0 |
| Medium | Tag | 22.5 | 63.3 | 65.1 | 61.7 | 66.7 | 77.2 | 132.9 |
| Random | Tag | 1.2 | 2.2 | 5.7 | 5.0 | 11.1 | 3.2 | 10.7 |
| Expert | World | 33.0 | 109.5 | 110.3 | 71.9 | 110.4 | 122.6 | 174.0 |
| Md-Replay | World | 2.3 | 12.0 | 17.4 | 29.6 | 42.9 | 57.1 | 83.0 |
| Medium | World | 25.3 | 71.9 | 73.4 | 58.6 | 74.6 | 123.5 | 158.2 |
| Random | World | -2.4 | 1.0 | 2.8 | 0.6 | 5.9 | 2.0 | 8.1 |
Performance on the MA-Mujoco datasets released in the off-the-grid MARL benchmark. Results are averaged over 5 random seeds.
| Dataset | Task | BC | MA-TD3+BC | OMAR | MADiff-D | MADiff-C* |
|---|---|---|---|---|---|---|
| Good | 2halfcheetah | 6846 | 7025 | 1434 | 8246 | 8514 |
| Medium | 2halfcheetah | 1627 | 2561 | 1892 | 2207 | 2203 |
| Poor | 2halfcheetah | 465 | 736 | 384 | 759 | 760 |
| Good | 2ant | 2697 | 2922 | 464 | 2946 | 3069 |
| Medium | 2ant | 1145 | 744 | 799 | 1211 | 1243 |
| Poor | 2ant | 954 | 1256 | 857 | 946 | 1038 |
| Good | 4ant | 2802 | 2628 | 344 | 3080 | 3068 |
| Medium | 4ant | 1617 | 1843 | 929 | 1649 | 1871 |
| Poor | 4ant | 1033 | 1075 | 518 | 1295 | 1353 |
Performance on the SMAC datasets released in the off-the-grid MARL benchmark. Results are averaged over 5 random seeds.
| Dataset | Task | BC | QMIX | MA-ICQ | MA-CQL | MADT | MADiff-D | MADiff-C* |
|---|---|---|---|---|---|---|---|---|
| Good | 3m | 16.0 | 13.8 | 18.8 | 19.6 | 19.1 | 19.3 | 19.9 |
| Medium | 3m | 8.2 | 17.3 | 18.1 | 18.9 | 15.8 | 17.3 | 18.1 |
| Poor | 3m | 4.4 | 10.0 | 14.4 | 5.8 | 4.4 | 9.6 | 9.5 |
| Good | 2s3z | 18.2 | 5.9 | 19.6 | 19.0 | 19.3 | 19.6 | 19.7 |
| Medium | 2s3z | 12.3 | 5.2 | 17.2 | 14.3 | 15.0 | 17.4 | 17.6 |
| Poor | 2s3z | 6.7 | 3.8 | 12.1 | 10.1 | 7.0 | 9.8 | 10.4 |
| Good | 5m6m | 16.6 | 8.0 | 16.3 | 13.8 | 16.7 | 17.8 | 18.0 |
| Medium | 5m6m | 12.4 | 12.0 | 15.3 | 17.0 | 16.6 | 17.3 | 18.0 |
| Poor | 5m6m | 7.5 | 10.7 | 9.4 | 10.4 | 7.8 | 8.9 | 10.3 |
| Good | 8m | 16.7 | 4.6 | 19.6 | 11.3 | 18.4 | 19.2 | 19.8 |
| Medium | 8m | 10.7 | 13.9 | 18.6 | 16.8 | 18.5 | 18.9 | 19.4 |
| Poor | 8m | 5.3 | 6.0 | 10.8 | 4.6 | 4.7 | 5.1 | 5.1 |
* MADiff-C is not meant as a fair comparison against the baseline methods, but to show whether MADiff-D closes the coordination gap without access to global information.
```bash
sudo apt-get update
sudo apt-get install libssl-dev libcurl4-openssl-dev swig
conda create -n madiff python=3.8
conda activate madiff
pip install torch==1.12.1+cu113 --extra-index-url https://download.pytorch.org/whl/cu113
pip install -r requirements.txt
```

We use the MPE dataset from OMAR. The dataset download link and instructions can be found in OMAR's repo. Since their BaiduPan download links might be inconvenient for non-Chinese users, we maintain an anonymous mirror repo on OSF for acquiring the dataset.
The downloaded dataset should be placed under `diffuser/datasets/data/mpe`.
Install the MPE environment:

```bash
pip install -e third_party/multiagent-particle-envs
pip install -e third_party/ddpg-agent
```
Install MA-Mujoco:

```bash
pip install -e third_party/multiagent_mujoco
```

We use the MA-Mujoco dataset from off-the-grid MARL. We preprocess the dataset to concatenate trajectories into full episodes and save them as `.npy` files for easier loading. The original dataset can be downloaded from the Huggingface repo. The downloaded dataset should be unzipped and placed under `diffuser/datasets/data/mamujoco`.

Install off-the-grid MARL and transform the original dataset:

```bash
pip install -r ./third_party/og-marl/install_environments/requirements/mamujoco.txt
pip install -e ./third_party/og-marl
python scripts/transform_og_marl_dataset.py --env_name mamujoco --map_name <map> --quality <dataset>
```
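The preprocessing idea above, concatenating trajectory segments into full episodes and saving each episode as a `.npy` file, can be sketched as follows. The array shapes and file name are illustrative assumptions, not the actual og-marl data schema:

```python
import os
import tempfile
import numpy as np

# Hypothetical trajectory segments belonging to one episode, shaped
# (timesteps, n_agents, obs_dim). Real og-marl records differ; this
# only illustrates the concatenate-then-save step.
segments = [np.zeros((25, 2, 8)), np.zeros((25, 2, 8))]

# Concatenate along the time axis to recover the full episode.
episode = np.concatenate(segments, axis=0)  # shape (50, 2, 8)

# Save the episode as a single .npy file so training code can load it
# with one np.load call instead of stitching segments at load time.
out_path = os.path.join(tempfile.mkdtemp(), "episode_0.npy")
np.save(out_path, episode)
assert np.load(out_path).shape == (50, 2, 8)
```

Storing whole episodes this way trades a one-off preprocessing pass for cheap, random-access loading during training.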
Run `scripts/smac.sh` to install StarCraft II.
Install SMAC:

```bash
pip install git+https://github.com/oxwhirl/smac.git
```
We use the SMAC dataset from off-the-grid MARL. We preprocess the dataset to concatenate trajectories into full episodes and save them as `.npy` files for easier loading. The original dataset can be downloaded from the Huggingface repo. The downloaded dataset should be unzipped and placed under `diffuser/datasets/data/smac`.

Install off-the-grid MARL and transform the original dataset:

```bash
pip install -r ./third_party/og-marl/install_environments/requirements/smacv1.txt
pip install -e ./third_party/og-marl
python scripts/transform_og_marl_dataset.py --env_name smac --map_name <map> --quality <dataset>
```
To start training, run the following commands:

```bash
# multi-agent particle environment
python run_experiment.py -e exp_specs/mpe/<task>/mad_mpe_<task>_attn_<dataset>.yaml  # CTCE
python run_experiment.py -e exp_specs/mpe/<task>/mad_mpe_<task>_ctde_<dataset>.yaml  # CTDE
# ma-mujoco
python run_experiment.py -e exp_specs/mamujoco/<task>/mad_mamujoco_<task>_attn_<dataset>_history.yaml  # CTCE
python run_experiment.py -e exp_specs/mamujoco/<task>/mad_mamujoco_<task>_ctde_<dataset>_history.yaml  # CTDE
# smac
python run_experiment.py -e exp_specs/smac/<map>/mad_smac_<map>_attn_<dataset>_history.yaml  # CTCE
python run_experiment.py -e exp_specs/smac/<map>/mad_smac_<map>_ctde_<dataset>_history.yaml  # CTDE
```

To evaluate a trained model, first replace `log_dir` in `exp_specs/eval_inv.yaml` with the log directories to be evaluated, then run
```bash
python run_experiment.py -e exp_specs/eval_inv.yaml
```

If you find this work useful, please cite:

```
@article{zhu2023madiff,
  title={MADiff: Offline Multi-agent Learning with Diffusion Models},
  author={Zhu, Zhengbang and Liu, Minghuan and Mao, Liyuan and Kang, Bingyi and Xu, Minkai and Yu, Yong and Ermon, Stefano and Zhang, Weinan},
  journal={arXiv preprint arXiv:2305.17330},
  year={2023}
}
```
The codebase is built upon the decision-diffuser repo and ILSwiss.