Status: Under construction.
Amca is an RL-based Backgammon agent.
| Dependency | Version Tested On |
|---|---|
| Ubuntu | 16.04 |
| Python | 3.6.8 |
| numpy | 1.15.4 |
| gym | 0.10.9 |
| Stable Baselines | 2.4.0a |
This project aims to formulate Backgammon as a reinforcement learning problem and gauge the performance of common RL algorithms. This is done by training and evaluating four popular algorithms:
- Deep Q Network (Mnih et al.)
- Proximal Policy Optimization (Schulman et al.)
- Soft Actor-Critic (Haarnoja et al.)
- SARSA (Rummery and Niranjan)
The testing is done with the default parameters and implementations provided by the Stable Baselines library for all three deep RL algorithms. SARSA uses a custom implementation heavily modified from this repo; its hyperparameters are given in the SarsaAgent object (a textbook sketch of the SARSA update appears after the usage list below).
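For the three deep RL algorithms, a minimal training sketch with Stable Baselines defaults might look like the following; the environment id `Backgammon-v0` is an assumption for illustration, not necessarily the id this repo registers:

```python
# Minimal sketch, assuming the repo registers a gym environment
# (the id 'Backgammon-v0' is an assumption).
import gym
from stable_baselines import PPO2
from stable_baselines.common.policies import MlpPolicy

env = gym.make('Backgammon-v0')          # assumed environment id
model = PPO2(MlpPolicy, env, verbose=1)  # Stable Baselines default hyperparameters
model.learn(total_timesteps=1000000)     # train for 1000000 steps
model.save('amca.pkl')                   # reload later with PPO2.load('amca.pkl')
```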
- `play.py`: to launch a game against a deep RL trained model. For example, `python play.py ppo amca/models/amca.pkl` will launch the model called `amca.pkl` that was trained using the PPO algorithm.
- `train.py`: to train a deep RL model (with default hyperparameters) to play. For example, `python train.py -n terminator.pkl -a sac -t 1000000` will train an agent called `terminator.pkl` using the SAC algorithm for 1000000 steps.
- `sarsa_play.py`: to launch a game against a SARSA-trained model. For example, `python sarsa_play.py r2d2.pkl` will launch the model called `r2d2.pkl` that was trained using the SARSA algorithm.
- `sarsa_train.py`: to train a model using SARSA. For example, `python sarsa_train.py jarvis.pkl -g 10000` will train an agent called `jarvis.pkl` using the SARSA algorithm for 10000 games.
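For context, the core of SARSA (Rummery and Niranjan) is the on-policy TD(0) update below. This is the textbook rule, not necessarily the exact code in SarsaAgent; the function names and default hyperparameters here are illustrative assumptions:

```python
import random
from collections import defaultdict

# Textbook tabular SARSA update; SarsaAgent in this repo may differ in detail.
def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99):
    """Q(s,a) <- Q(s,a) + alpha * (r + gamma * Q(s',a') - Q(s,a))."""
    Q[(s, a)] += alpha * (r + gamma * Q[(s_next, a_next)] - Q[(s, a)])

def epsilon_greedy(Q, s, actions, epsilon=0.1):
    """Explore with probability epsilon, otherwise act greedily w.r.t. Q."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(s, a)])

Q = defaultdict(float)  # state-action values, default 0.0
```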