My coursework for the CS294 Deep Reinforcement Learning course, taught by Sergey Levine at UC Berkeley.
Updated Jan 15, 2018 - Python
PyTorch implementation of "Sample-efficient Imitation Learning via Generative Adversarial Nets"
PyTorch implementation of our work: "Lipschitzness Is All You Need To Tame Off-policy Generative Adversarial Imitation Learning"
This repository contains implementations of a wide variety of reinforcement learning projects covering bandit algorithms, MDPs, distributed RL, and deep RL. It includes both university projects and projects undertaken out of personal interest in reinforcement learning.
Sample Policy Gradient
This repository contains all of the reinforcement learning projects I've worked on. The projects were part of a graduate course at the University of Tehran.
Q-learning is an off-policy temporal-difference control algorithm. It learns the optimal action-value function directly, independent of the behavior policy the agent actually follows.
Contains PyTorch implementations of the following off-policy actor-critic algorithms
Autonomous Parking with Deep Reinforcement Learning: Custom MDP Development
Ensemble and Auxiliary Tasks for Data-Efficient Deep Reinforcement Learning
PyTorch implementation of our work: "Where is the Grass Greener? Revisiting Generalized Policy Iteration for Offline Reinforcement Learning"
PyTorch implementation of our work: "Optimality Inductive Biases and Agnostic Guidelines for Offline Reinforcement Learning"
PROJECT MIGRATED TO CODEBERG - Reinforcement Learning in Multiplicative Domains
(Neurocomputing) Source code for the paper: "Provable Generalization of Clipped Double Q-Learning for Variance Reduction and Sample Efficiency"
TensorFlow implementation of "Sample-efficient Imitation Learning via Generative Adversarial Nets"
Experimenting with State-Action Distance RL
Contains a custom-built reinforcement learning environment and implementations of key RL algorithms such as Q-learning and SARSA, tested in scenarios including a drone navigation challenge and the Frozen Lake environment.
Applying Deep Reinforcement Learning for Gridworld Best-Route Navigation
Collection of code from my research on model-free RL algorithms.
Fine-Tuning GPT-2, RoBERTa, and PPO Architecture for Enhancing Reliability of General-Purpose Chatbot
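One of the descriptions above characterizes Q-learning as an off-policy temporal-difference control algorithm. The following is a minimal tabular sketch of that idea, not code from any of the listed repositories. The four-state chain environment, the function names (`q_learning_update`, `chain_step`, `train_on_chain`), and the hyperparameters are all assumptions made for illustration. The behavior policy is uniformly random, yet the update bootstraps from the greedy value of the next state, which is what makes the method off-policy.

```python
import random


def q_learning_update(q, state, action, reward, next_state, done,
                      alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning step in place and return the new value."""
    # The target uses the best next action, not the action the agent will take.
    best_next = 0.0 if done else max(q[next_state])
    td_target = reward + gamma * best_next
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


def chain_step(state, action, n_states=4):
    """Hypothetical toy environment: action 0 moves left, action 1 moves right.

    Reaching the last state ends the episode with reward 1.
    """
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == n_states - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train_on_chain(episodes=500, alpha=0.5, gamma=0.9, seed=0, n_states=4):
    """Learn Q-values while acting with a purely random behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.randrange(2)  # behavior policy ignores q entirely
            next_state, reward, done = chain_step(state, action, n_states)
            q_learning_update(q, state, action, reward, next_state, done,
                              alpha, gamma)
            state = next_state
    return q


def greedy_policy(q, n_states=4):
    """Greedy action for every non-terminal state."""
    return [max(range(2), key=lambda a: q[s][a]) for s in range(n_states - 1)]
```

Calling `greedy_policy(train_on_chain())` should select action 1 (move right) in every non-terminal state, even though the agent never acted greedily during training.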