
Policy Learning from Few Expert Videos via Long-Short Term Optimal Transport Reward Coordination

This is the official PyTorch implementation of the DualOT algorithm from the paper "Policy Learning from Few Expert Videos via Long-Short Term Optimal Transport Reward Coordination".

Environment Setup

Build the experimental environment using Docker:

Before building the Docker image, make sure you are in the directory containing the Dockerfile.

docker build -t dualot:v1 .

After creating a container from this image, enter it and activate the Python environment with conda activate dualot. You can then run the algorithm.
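
For reference, a container can be created and entered along these lines (a minimal sketch; the --gpus flag and the mount path are assumptions that depend on your host setup):

docker run -it --gpus all -v $(pwd):/workspace dualot:v1 /bin/bash
# inside the container, activate the environment:
conda activate dualot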

Build the experimental environment using Conda:

  1. Follow this link to install MuJoCo.

  2. Install the following libraries:

sudo apt-get update
sudo apt-get install libosmesa6-dev libgl1-mesa-glx libglfw3
  3. Install other dependencies:
conda create -y -n dualot python=3.9.19
conda install -y pytorch=2.2.2 torchvision=0.17.2 torchaudio=2.2.2 -c pytorch -c nvidia
pip install -r Dockerfile/downloads/requirements.txt
pip install dm_control PyOpenGL-accelerate
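
As a quick sanity check (a minimal sketch assuming only the standard PyTorch and dm_control APIs), you can verify the installation from inside the new environment:

conda activate dualot
# check that PyTorch is installed and sees the GPU
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
# check that dm_control can load a task (requires a working MuJoCo install)
python -c "from dm_control import suite; suite.load('cartpole', 'swingup')"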

Prepare Expert Video Dataset

  1. For the Meta-World Benchmark, you can either download the expert demonstration dataset directly from this link, or generate a new expert demonstration dataset with metaworld_generate_expert/generate_demo.py. Place the dataset in the folder IL/expert_demos/metaworld/${metaworld_task_name}.

  2. For the DeepMind Control Suite (DMC) Benchmark, we use the DrQv2 algorithm to train an agent on the corresponding task, and then use the trained agent to collect 10 expert demonstrations. Place the dataset in the folder IL/expert_demos/dmc/${dmc_task_name} (see the layout sketch after this list).
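
For reference, a minimal sketch of the expected directory layout; the task names below are placeholders, and the file format inside each task folder is determined by the download or generation step above:

# e.g. metaworld_task_name=basketball, dmc_task_name=quadruped_run
mkdir -p IL/expert_demos/metaworld/${metaworld_task_name}
mkdir -p IL/expert_demos/dmc/${dmc_task_name}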

Training the Agent

For the Meta-World Benchmark, run the code with a command similar to the following:

# Make sure you run this from inside the "IL" folder.
python train.py \
    root_dir=DualOT/IL \
    seed=2 \
    suite=metaworld \
    suite/metaworld_task=basketball \
    obs_type=pixels \
    agent=dualot

For the DMC Benchmark, run the code with a command similar to the following:

# Make sure you run this from inside the "IL" folder.
python train.py \
    root_dir=DualOT/IL \
    seed=2 \
    suite=dmc \
    suite/dmc_task=quadruped_run \
    suite.num_train_frames=500000 \
    obs_type=pixels \
    agent=dualot
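
The override syntax above suggests a Hydra-style configuration interface; assuming that holds, multiple seeds can be swept with an ordinary shell loop (a sketch, not part of the official scripts):

# run three seeds of the DMC experiment back to back
for seed in 1 2 3; do
    python train.py \
        root_dir=DualOT/IL \
        seed=$seed \
        suite=dmc \
        suite/dmc_task=quadruped_run \
        suite.num_train_frames=500000 \
        obs_type=pixels \
        agent=dualot
done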

Acknowledgments
