This is the official PyTorch implementation of the DualOT algorithm from the paper "Policy Learning from Few Expert Videos via Long-Short Term Optimal Transport Reward Coordination".
Before building the Docker image, make sure you are in the Dockerfile directory.
docker build -t dualot:v1 .
After creating a Docker container from this image, enter the container and activate the Python environment with conda activate dualot. You can then run the algorithm.
sudo apt-get update
sudo apt-get install libosmesa6-dev libgl1-mesa-glx libglfw3
- Install other dependencies:
conda create -y -n dualot python=3.9.19
conda install -y pytorch=2.2.2 torchvision=0.17.2 torchaudio=2.2.2 -c pytorch -c nvidia
pip install -r Dockerfile/downloads/requirements.txt
pip install dm_control PyOpenGL-accelerate
- For the Meta-World benchmark, you can either download the expert demonstration dataset directly from this link, or generate a new expert demonstration dataset using metaworld_generate_expert/generate_demo.py. The dataset is placed in the folder IL/expert_demos/metaworld/${metaworld_task_name}.
- For the DeepMind Control Suite (DMC) benchmark, we use the DrQv2 algorithm to train an agent on the corresponding task, and then use the trained agent to collect 10 expert demonstrations. The dataset is placed in the folder IL/expert_demos/dmc/${dmc_task_name}.
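Before launching training, it can help to confirm that the demonstrations are where the layout above expects them. The following is a minimal sketch, not part of the repository: the helper names `expert_demo_dir` and `has_expert_demos` are hypothetical, and the check only tests that the task folder exists and is non-empty, since the file names inside it are not specified here.

```python
from pathlib import Path


def expert_demo_dir(suite: str, task_name: str, root: str = "IL") -> Path:
    """Build IL/expert_demos/<suite>/<task_name>, the layout described above.

    Hypothetical helper for illustration; `root` is assumed to be the
    path to the "IL" folder relative to the current directory.
    """
    return Path(root) / "expert_demos" / suite / task_name


def has_expert_demos(suite: str, task_name: str, root: str = "IL") -> bool:
    """Return True if the task folder exists and contains at least one entry."""
    folder = expert_demo_dir(suite, task_name, root)
    return folder.is_dir() and any(folder.iterdir())


if __name__ == "__main__":
    # Task names taken from the example commands in this README.
    for suite, task in [("metaworld", "basketball"), ("dmc", "quadruped_run")]:
        status = "found" if has_expert_demos(suite, task) else "missing"
        print(f"{expert_demo_dir(suite, task)}: {status}")
```

Run it from the repository root, or pass a different `root` if you are already inside the IL folder.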
For the Meta-World benchmark, run the code with a command like the following:
# Run this command from inside the "IL" folder.
python train.py \
root_dir=DualOT/IL \
seed=2 \
suite=metaworld \
suite/metaworld_task=basketball \
obs_type=pixels \
agent=dualot
For the DMC benchmark, run the code with a command like the following:
# Run this command from inside the "IL" folder.
python train.py \
root_dir=DualOT/IL \
seed=2 \
suite=dmc \
suite/dmc_task=quadruped_run \
suite.num_train_frames=500000 \
obs_type=pixels \
agent=dualot
- Our codebase is built upon the ADS codebase and the TemporalOT codebase.
- The testing environments are based on Metaworld and DeepMind Control Suite.
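The two training commands shown earlier differ only in the suite, the task, and any extra overrides, so they can be generated programmatically when sweeping over seeds. The sketch below is an illustration under assumptions: `build_train_command` is a hypothetical helper, not something the repository provides, and it simply reproduces the key=value overrides from the example commands without launching anything.

```python
import shlex


def build_train_command(suite, task, seed, extra_overrides=()):
    """Assemble the train.py argument list used in the example commands.

    Hypothetical helper: the overrides mirror the README examples
    (root_dir, seed, suite, suite/<suite>_task, obs_type, agent).
    """
    return [
        "python", "train.py",
        "root_dir=DualOT/IL",
        f"seed={seed}",
        f"suite={suite}",
        f"suite/{suite}_task={task}",
        *extra_overrides,
        "obs_type=pixels",
        "agent=dualot",
    ]


if __name__ == "__main__":
    # Print one command per seed; run them from inside the "IL" folder.
    for seed in (1, 2, 3):
        print(shlex.join(build_train_command("metaworld", "basketball", seed)))
```

To actually launch a run, the returned list could be passed to `subprocess.run(cmd, check=True)` with the working directory set to the IL folder.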