Policy Gradient

Experiments for learning about policy gradient methods. REINFORCE is implemented for discrete actions in discrete_actions.py and for continuous actions in cont_actions.py. DDPG with continuous actions is implemented in ddpg.py.
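
For orientation, here is a minimal sketch of the two quantities at the core of REINFORCE: the discounted return for each step of an episode, and the surrogate loss whose gradient gives the policy gradient estimate. It is not taken from discrete_actions.py or cont_actions.py. The function names `discounted_returns` and `reinforce_loss` and the default `gamma` are assumptions made for illustration, and plain Python floats stand in for whatever tensor library the scripts use.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk the episode backwards so each return reuses the one after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def reinforce_loss(log_probs, returns):
    """Surrogate loss -sum_t log pi(a_t | s_t) * G_t (hypothetical helper).

    log_probs holds the log-probability of each action actually taken.
    Minimizing this value with an autodiff library performs gradient
    ascent on the expected return.
    """
    return -sum(lp * g for lp, g in zip(log_probs, returns))


if __name__ == "__main__":
    episode_rewards = [1.0, 1.0, 1.0]
    returns = discounted_returns(episode_rewards, gamma=0.5)
    # Assumed action probabilities, for illustration only.
    log_probs = [math.log(0.5), math.log(0.25), math.log(0.5)]
    print(returns)
    print(reinforce_loss(log_probs, returns))
```

In practice the returns are often normalized or have a baseline subtracted to reduce variance. Whether the scripts here do so is not stated.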