AMPED (Adaptive Multi-objective Projection for balancing Exploration and skill Diversification) is a skill-based reinforcement learning algorithm designed to explicitly balance exploration and skill diversity. AMPED integrates entropy- and RND-based exploration with contrastive skill separation, and resolves conflicting learning signals using gradient surgery.
For more information, please see our project webpage.
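To make the gradient-surgery idea concrete, below is a minimal PCGrad-style sketch of projecting away conflicting gradient components. It is an illustration only, not the exact AMPED implementation; the tensor shapes and variable names are placeholders.

```python
# Illustrative PCGrad-style gradient surgery (not the exact AMPED code):
# if two objective gradients conflict (negative inner product), remove the
# conflicting component of one before combining the update directions.
import torch


def project_conflicting(grad_a: torch.Tensor, grad_b: torch.Tensor, eps: float = 1e-12) -> torch.Tensor:
    """Return grad_a with the component that conflicts with grad_b projected out."""
    dot = torch.dot(grad_a.flatten(), grad_b.flatten())
    if dot < 0:  # gradients point in opposing directions
        grad_a = grad_a - (dot / (grad_b.norm() ** 2 + eps)) * grad_b
    return grad_a


# Example: combine an exploration gradient with a skill-diversity gradient.
g_explore, g_skill = torch.randn(8), torch.randn(8)
g_total = project_conflicting(g_explore, g_skill) + g_skill
```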
AMPED Demonstrations
Before getting started, make sure the following requirements are met:
- Conda (for environment management)
- (Optional) A GPU with CUDA 11.1 and cuDNN 8, if you want to run experiments on GPU
To install dependencies, create the Conda environment:

```
conda env create -f conda_env.yml
```

After the installation finishes, activate the environment with:

```
conda activate amped
```

The main implementation of AMPED can be found in `agent/amped.py`.
Implementations of baseline agents such as APT, BeCL, CeSD, CIC, ComSD, DIAYN, and RND are also available in the `agent/` directory. The SAC-based skill selector is implemented in `skill_selector/sac.py`.
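As a rough illustration of the selector's role (the actual interfaces live in `skill_selector/sac.py` and `agent/amped.py`; the dimensions and module names below are made up for the example), a discrete selector scores the current observation, samples a skill index, and the skill-conditioned policy acts on the observation concatenated with the one-hot skill:

```python
# Toy sketch of a discrete skill selector driving a skill-conditioned policy.
# All sizes and module names are illustrative, not taken from the repo.
import torch
import torch.nn as nn
import torch.nn.functional as F

OBS_DIM, NUM_SKILLS, ACT_DIM = 24, 16, 6  # placeholder sizes

selector = nn.Sequential(nn.Linear(OBS_DIM, 128), nn.ReLU(),
                         nn.Linear(128, NUM_SKILLS))            # skill logits
actor = nn.Sequential(nn.Linear(OBS_DIM + NUM_SKILLS, 128), nn.ReLU(),
                      nn.Linear(128, ACT_DIM))                  # skill-conditioned policy

obs = torch.randn(1, OBS_DIM)
skill_idx = torch.distributions.Categorical(logits=selector(obs)).sample()  # pick a skill
skill = F.one_hot(skill_idx, NUM_SKILLS).float()                            # encode as one-hot
action = actor(torch.cat([obs, skill], dim=-1))                             # act with that skill
```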
AMPED supports training agents across various domains and tasks through a two-stage process:

- Pre-training: Learn diverse and meaningful skills using unsupervised objectives.
  - Run via `pretrain.py`.
  - Pre-training produces several agent snapshots after 100k, 500k, 1M, and 2M frames; snapshots are stored in `./models/states/<domain>/<agent>/<seed>/` (e.g., `./models/states/walker/amped/3/`).
- Fine-tuning: Adapt the pretrained policy to a downstream task using its extrinsic reward.
  - Run via `finetune.py` or `finetunev2.py`.
  - Use `finetunev2.py` to enable the skill selector, which dynamically chooses the best skill to execute at each time step.
  - During fine-tuning, the pretrained agent is initialized from a saved snapshot and continues learning in a reward-driven setting (a minimal loading sketch is shown after this list).
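For reference, here is a minimal sketch of how a saved snapshot could be restored before fine-tuning. The file name `snapshot_2000000.pt` and the `"agent"` key are assumptions for illustration; check the actual contents of your snapshot directory.

```python
# Minimal sketch of restoring a pre-trained snapshot (file name and payload
# keys are assumptions for illustration, not guaranteed by the repo).
from pathlib import Path
import torch

snapshot_dir = Path("./models/states/walker/amped/3")
snapshot_path = snapshot_dir / "snapshot_2000000.pt"  # hypothetical file name

with snapshot_path.open("rb") as f:
    payload = torch.load(f, map_location="cpu")

# URLB-style snapshots often store the agent (plus bookkeeping counters) in a dict.
agent = payload.get("agent", payload)
```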
```
# Pre-train AMPED on walker domain
python pretrain.py agent=amped domain=walker seed=3

# Finetune AMPED on walker_stand task
python finetune.py task=walker_stand obs_type=states agent=amped reward_free=false seed=3 domain=walker snapshot_ts=2000000

# Pre-train APT on jaco domain
python pretrain.py agent=apt domain=jaco seed=100

# Finetune APT on jaco_reach_top_left task
python finetune.py task=jaco_reach_top_left obs_type=states agent=apt reward_free=false seed=100 domain=jaco snapshot_ts=2000000

# Finetune AMPED on walker_stand task with skill selector
python finetunev2.py task=walker_stand obs_type=states agent=amped reward_free=false seed=3 domain=walker snapshot_ts=2000000
```

AMPED supports the following domains and associated tasks:
| Domain | Tasks |
|---|---|
| walker | stand, walk, run, flip |
| quadruped | walk, run, stand, jump |
| jaco | reach_top_left, reach_top_right, reach_bottom_left, reach_bottom_right |
We support the following baseline agents:
| Baseline |
|---|
| APT: Behavior From the Void: Unsupervised Active Pre-Training (NeurIPS 2021) |
| BeCL: Behavior Contrastive Learning for Unsupervised Skill Discovery (ICML 2023) |
| CeSD: Constrained Ensemble Exploration for Unsupervised Skill Discovery (ICML 2024) |
| CIC: Contrastive Intrinsic Control for Unsupervised Skill Discovery |
| ComSD: Balancing Behavioral Quality and Diversity in Unsupervised Skill Discovery (IEEE T-Cybernetics 2025) |
| DIAYN: Diversity is All You Need: Learning Skills without a Reward Function |
| RND: Exploration by Random Network Distillation |
Training logs are saved in the `exp_local` directory. To launch TensorBoard, run:

```
tensorboard --logdir exp_local
```

For logging with Weights & Biases (wandb):

- Set `use_wandb: true` and provide your WandB API key via the `wandb_key` field in the `config.yaml` file.
- Alternatively, you can enable logging by passing `--use_wandb true` and `--wandb_key <your_wandb_key>` as command-line arguments.
Console output is also provided in this format:

```
| train | F: 6000 | S: 3000 | E: 6 | L: 1000 | R: 5.5177 | FPS: 96.7586 | T: 0:00:42
```

where:

- F: total number of environment frames
- S: total number of agent steps
- E: total number of episodes
- L: episode length
- R: episode return
- FPS: training throughput (frames per second)
- T: total training time
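If you need these console metrics programmatically (e.g., for quick plotting), a line in this format can be split into key/value pairs with a small regex; this is just a convenience sketch, not part of the codebase.

```python
# Parse a console log line of the format shown above into a dict of strings.
import re

line = "| train | F: 6000 | S: 3000 | E: 6 | L: 1000 | R: 5.5177 | FPS: 96.7586 | T: 0:00:42"
fields = dict(re.findall(r"(\w+): ([\d:.]+)", line))
print(fields)  # {'F': '6000', 'S': '3000', ..., 'T': '0:00:42'}
```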
- We adopt an agent-specific code structure to avoid unnecessary complexity. Instead of sharing a unified codebase across all agents, we maintain a separate set of files for each agent to improve clarity and modularity.
- Please note that this codebase may not exactly reproduce the results reported in the paper due to potential human errors during code migration. If you observe any discrepancies in performance, feel free to reach out; we would appreciate your feedback.
This codebase is built on top of the Unsupervised Reinforcement Learning Benchmark (URLB) codebase.
The implementation of CeSD is adapted from the CeSD repository, BeCL from the BeCL repository, CIC from the CIC repository, and ComSD from the ComSD repository.
This project is licensed under the MIT License -- see the LICENSE file for details. Note that the repository relies on third-party libraries subject to their respective licenses.
```bibtex
@article{AMPED,
  title={AMPED: Adaptive Multi-objective Projection for balancing Exploration and skill Diversification},
  author={Cho, Geonwoo and Lee, Jaemoon and Im, Jaegyun and Lee, Subi and Lee, Jihwan and Kim, Sundong},
  journal={arXiv preprint arXiv:2506.05980},
  year={2025}
}
```