This repository collects Python implementations of the pseudocode in "Algorithms for Reinforcement Learning".

| No | Algorithm | Implementation | Environment Name |
|---|---|---|---|
| 1 | Tabular TD(0) | TabularTdZero | FrozenLake-v0 |
| 2 | Every-visit Monte-Carlo | EveryVistMC | FrozenLake-v0 |
| 3 | Tabular TD(λ) | TabularTDLambda | FrozenLake-v0 |
| 4 | TD(λ) w/ function approximation | TDLambdaLinFApp | |
| 5 | GTD2 | ||
| 6 | RLSTD | ||
| 7 | λ-LSPE | ||
| 8 | UCB1 select | ||
| 9 | UCB1 update | ||
| 10 | UCRL2 | ||
| 11 | Finding optimal policy by UCRL2 | ||
| 12 | Tabular Q-learning | TabularQLearning | FrozenLake-v0 |
| 13 | Q-learning w/ function approximation | QLearningLinFApp | MountainCar-v0 |
| 14 | Fitted Q-iteration | | |
| 15 | SARSA(λ) | SARSA | MountainCar-v0 |
| 16 | LSTD-Q(λ) | | |
| 17 | LSPI(λ) | ||
| 18 | Actor-critic | | |
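
To give a feel for what the first entry in the table computes, here is a minimal sketch of tabular TD(0) value estimation. It is not the repository's `TabularTdZero` class, and it does not use FrozenLake-v0: to stay dependency-free it assumes a hypothetical 5-state random walk (start in the middle, step left or right with equal probability, reward 1 only when leaving on the right). The function name `td_zero` and all of its parameters are illustrative assumptions.

```python
import random


def td_zero(n_states=5, episodes=2000, alpha=0.05, gamma=1.0, seed=0):
    """Estimate state values of a random walk with tabular TD(0).

    Hypothetical toy environment (an assumption, not FrozenLake-v0):
    states 0..n_states-1, the walk starts in the middle, moves left or
    right uniformly at random, and ends when it leaves either side.
    Leaving on the right gives reward 1, everything else gives 0.
    """
    rng = random.Random(seed)
    values = [0.0] * n_states

    for _ in range(episodes):
        state = n_states // 2
        while True:
            next_state = state + rng.choice((-1, 1))
            done = next_state < 0 or next_state >= n_states
            reward = 1.0 if next_state >= n_states else 0.0

            # Terminal states have value 0, so there is nothing to bootstrap from.
            bootstrap = 0.0 if done else values[next_state]

            # TD(0) update: move V(s) toward the one-step target r + gamma * V(s').
            td_error = reward + gamma * bootstrap - values[state]
            values[state] += alpha * td_error

            if done:
                break
            state = next_state

    return values


if __name__ == "__main__":
    print([round(v, 2) for v in td_zero()])
```

For this walk the true values are 1/6, 2/6, ..., 5/6, so the estimates should increase roughly from left to right. With a constant step size they keep fluctuating around those values rather than converging exactly.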