This project is for learning purposes only.
Feel free to clone it if you are interested.
PPO: https://github.com/nikhilbarhate99/PPO-PyTorch
==Ubuntu==
apt update
apt install software-properties-common
apt install libfreetype6-dev libsdl-image1.2-dev libsdl-mixer1.2-dev libsdl-ttf2.0-dev libsdl1.2-dev libsmpeg-dev subversion libportmidi-dev ffmpeg libswscale-dev libavformat-dev libavcodec-dev build-essential libssl-dev libffi-dev
add-apt-repository ppa:deadsnakes/ppa
apt install python3.8 python3.8-dev python3.8-venv
python3.8 -m venv env
source env/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
==CUDA==
curl -O https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo add-apt-repository "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ /"
sudo apt update
sudo apt install cuda
nvidia-smi
==Tricks==
- n-step rewards
- GAE (Generalized Advantage Estimation)
- GRU RNN network
  - Bidirectional
  - 2 layers
  - 0.2 dropout
- CNN
- Reward normalization instead of advantage normalization
- Orthogonal weight initialization
- Constant 0 bias initialization
- ICM (Intrinsic Curiosity Module)
  - Resources:
    - https://zhuanlan.zhihu.com/p/66303476
    - https://github.com/bonniesjli/icm/blob/master/icm.py
    - https://github.com/chagmgang/pytorch_ppo_rl/blob/master/model.py
    - https://github.com/jcwleo/curiosity-driven-exploration-pytorch/blob/master/model.py
    - https://zhuanlan.zhihu.com/p/161948260
    - https://github.com/uvipen/Street-fighter-A3C-ICM-pytorch/blob/4574cbfcbd148ed1d127ae053fe4afe943a18939/src/model.py
    - https://github.com/adik993/ppo-pytorch/blob/master/curiosity/icm.py
  - Notes: Not very useful. A large value pushes training in the wrong direction. A small value makes no difference to the result but makes training much slower.
- AlphaZero MCTS
  - https://github.com/suragnair/alpha-zero-general
  - https://github.com/louisnino/RLcode/tree/master/Alpha-Zero
  - https://github.com/plkmo/AlphaZero_Connect4
  - https://zhuanlan.zhihu.com/p/115867362
  - https://github.com/hijkzzz/alpha-zero-gomoku
  - https://github.com/junxiaosong/AlphaZero_Gomoku/
  - https://github.com/NeymarL/ChineseChess-AlphaZero
  - https://github.com/blanyal/alpha-zero
- RunningMeanStd
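
For the RunningMeanStd and reward normalization items, here is a minimal sketch of how such a tracker is commonly written. It is not taken from this repository: the class layout follows the widely used batched-moments pattern, and the `normalize` helper, its `clip` parameter, and the choice to subtract the mean are assumptions made for illustration.

```python
import math
from typing import Sequence


class RunningMeanStd:
    """Running mean and variance of a scalar stream, updated in batches."""

    def __init__(self, epsilon: float = 1e-4):
        self.mean = 0.0
        self.var = 1.0
        self.count = epsilon  # tiny pseudo-count avoids division by zero

    def update(self, batch: Sequence[float]) -> None:
        n = len(batch)
        if n == 0:
            return
        batch_mean = sum(batch) / n
        batch_var = sum((x - batch_mean) ** 2 for x in batch) / n
        delta = batch_mean - self.mean
        total = self.count + n
        # Merge the stored moments with the batch moments (parallel variance update)
        m2 = (self.var * self.count + batch_var * n
              + delta ** 2 * self.count * n / total)
        self.mean += delta * n / total
        self.var = m2 / total
        self.count = total

    def normalize(self, x: float, clip: float = 10.0) -> float:
        # Assumed helper: standardize a reward, then clip extreme values
        z = (x - self.mean) / math.sqrt(self.var + 1e-8)
        return max(-clip, min(clip, z))


rms = RunningMeanStd()
rms.update([1.0, 2.0, 3.0, 4.0])
rms.update([5.0, 6.0, 7.0, 8.0])
scaled = [rms.normalize(r) for r in (1.0, 4.5, 8.0)]
```

Calling `update` once per rollout and `normalize` on each reward keeps the reward scale roughly stable as the policy improves. Some implementations divide by the standard deviation only, without subtracting the mean.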
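
For the n-step rewards and GAE items, the two estimators can be sketched in plain Python as follows. This is a hedged illustration rather than the code used in this project: the function names, the argument layout, and the default `gamma`, `lam`, and `n` values are all assumptions.

```python
from typing import List, Sequence


def gae_advantages(
    rewards: Sequence[float],
    values: Sequence[float],
    dones: Sequence[bool],
    last_value: float,
    gamma: float = 0.99,
    lam: float = 0.95,
) -> List[float]:
    """GAE over one rollout; values[t] is V(s_t), last_value is V(s_T)."""
    advantages = [0.0] * len(rewards)
    next_value = last_value
    running = 0.0
    for t in reversed(range(len(rewards))):
        not_done = 0.0 if dones[t] else 1.0
        # One-step TD error: r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        # Exponentially weighted sum of TD errors, reset at episode ends
        running = delta + gamma * lam * not_done * running
        advantages[t] = running
        next_value = values[t]
    return advantages


def n_step_returns(
    rewards: Sequence[float],
    values: Sequence[float],
    dones: Sequence[bool],
    last_value: float,
    n: int = 3,
    gamma: float = 0.99,
) -> List[float]:
    """n-step bootstrapped return for every step of the rollout."""
    horizon = len(rewards)
    bootstrap = list(values) + [last_value]
    returns = []
    for t in range(horizon):
        g, discount, k = 0.0, 1.0, t
        terminated = False
        while k < min(t + n, horizon):
            g += discount * rewards[k]
            discount *= gamma
            k += 1
            if dones[k - 1]:
                terminated = True
                break
        if not terminated:
            # Bootstrap from the value of the first state not covered by rewards
            g += discount * bootstrap[k]
        returns.append(g)
    return returns
```

With `lam=1.0` the GAE advantage reduces to the discounted return minus the value estimate, and with `lam=0.0` it reduces to the one-step TD error, which is a quick way to sanity-check an implementation.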
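
For the GRU and initialization items, the settings listed above (bidirectional, 2 layers, 0.2 dropout, orthogonal weights, zero biases) map onto PyTorch roughly as in the sketch below. The class name `GRUEncoder`, the layer sizes, and the linear head are hypothetical and only serve to make the example runnable. The actual network in this project may be wired differently.

```python
import torch
import torch.nn as nn


class GRUEncoder(nn.Module):
    """Illustrative recurrent encoder; the class name and sizes are assumptions."""

    def __init__(self, input_size: int = 32, hidden_size: int = 64):
        super().__init__()
        self.gru = nn.GRU(
            input_size,
            hidden_size,
            num_layers=2,
            dropout=0.2,  # applied between the two stacked layers
            bidirectional=True,
            batch_first=True,
        )
        # Forward and backward states are concatenated: 2 * hidden_size
        self.head = nn.Linear(2 * hidden_size, hidden_size)
        self.reset_parameters()

    def reset_parameters(self) -> None:
        for name, param in self.named_parameters():
            if "weight" in name:
                nn.init.orthogonal_(param)     # orthogonal weights
            elif "bias" in name:
                nn.init.constant_(param, 0.0)  # constant 0 biases

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, input_size); h: (layers * directions, batch, hidden)
        _, h = self.gru(x)
        # Final states of the top layer: h[-2] is forward, h[-1] is backward
        last = torch.cat([h[-2], h[-1]], dim=-1)
        return self.head(last)
```

A batch of shape `(4, 10, 32)` passed through `GRUEncoder()` yields an output of shape `(4, 64)`.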