Based on the video series here: https://www.bilibili.com/video/BV1Ge4y1i7L6/
Source code adapted from https://github.com/lansinuote/Simple_Reinforcement_Learning and https://github.com/lansinuote/More_Simple_Reinforcement_Learning/tree/main
My recommendation: for the material before PPO, read v1; from PPO onward, read v2, which has more written explanation.
v3 follows Hands-on RL: the tutorial is at https://hrl.boyuai.com/chapter/, and the code is adapted from https://github.com/boyu-ai/Hands-on-RL. Note that the upstream code is from a very old version, so I changed it where needed to support Python 3.12.
See requirements.txt for the dependencies.