Technische Universität Darmstadt, Winter Semester 2018/2019
Supervision: Jan Peters, Samuele Tosatto
- Cartpole Stabilization (Further info)
- Cartpole Swing-up (Further info)
- Qube/Furuta Pendulum (Further info)
The following Python packages are required:
- autograd
- baselines - For installation details see: https://github.com/openai/baselines
- dill
- GPy
- gym
- matplotlib
- matplotlib2tikz (optional; only needed to save PILCO plots)
- numpy
- pytorch
- scipy
- tensorboard
- tensorboardX
- tensorflow
- torchvision
- quanser_robots
The following Linux packages are required:
- ffmpeg
The following are required for the PILCO test cases:
- Octave installation
- oct2py (python package)
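As a quick sanity check before running anything, the Python requirements above can be verified with a short script. This is only a sketch: the import names are assumed to match the package names, except for pytorch, which is imported as `torch`.

```python
import importlib.util

# Import names for the required Python packages listed above
# (note: the pytorch package is imported as "torch")
REQUIRED = [
    "autograd", "baselines", "dill", "GPy", "gym", "matplotlib",
    "numpy", "torch", "scipy", "tensorboardX", "tensorflow",
    "torchvision", "quanser_robots", "oct2py",
]

# find_spec returns None for packages that are not installed
missing = [name for name in REQUIRED if importlib.util.find_spec(name) is None]

if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages found.")
```

Running this before the first experiment makes missing dependencies visible immediately instead of failing mid-run.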
Alternatively, all required packages can be installed at once via our exported Anaconda environment.
To create a new Anaconda environment from a YML file, use:
conda env create --name my_env_name --file path/to/conda_env.yml
The Python version (3.6.5) is pinned inside the YML file; conda env create does not accept a package specification such as python=3.6.5 on the command line.
Please be aware that the Quanser environments are still subject to change; results and policies might no longer be reproducible or applicable. The latest Quanser version introduced different constraints for the cartpole environment, which can cause issues.
We added a small subset of experiment runs that we found useful for getting a better feeling for the hyper-parameters and the algorithms in general. They allow comparing different hyper-parameter settings in terms of performance and sample efficiency.
More details can be found here.
To run experiments with A3C or PILCO, please check the corresponding README.
Log files for all runs will be saved to ./experiments/logs/.
Our comprehensive report can be found here.
@software{otto_czech_2019,
title = {Project Lab Reinforcement Learning, {TU} Darmstadt, {WS}18/19: {ottofabian}/{RL}-Project},
url = {https://github.com/ottofabian/RL-Project},
shorttitle = {Project Lab Reinforcement Learning, {TU} Darmstadt, {WS}18/19},
author = {Otto, Fabian and Czech, Johannes},
urldate = {2019-03-15},
date = {2019-03-15},
}