off-policy

Here are 45 public repositories matching this topic...

mabirck / CS294-DeepRL

My coursework for the CS294 Deep Reinforcement Learning course, taught by Sergey Levine at UC Berkeley.

deep-neural-networks reinforcement-learning deep-learning deep-reinforcement-learning pytorch neural-networks policy-gradient reinforcement pytorch-tutorials cs294 on-policy off-policy

Updated Jan 15, 2018
Python

lionelblonde / sam-pytorch-complete-history

PyTorch implementation of "Sample-efficient Imitation Learning via Generative Adversarial Nets"

reinforcement-learning pytorch gan imitation-learning gail off-policy

Updated Aug 9, 2021
Python

lionelblonde / liayn-pytorch-complete-history

PyTorch implementation of our work: "Lipschitzness Is All You Need To Tame Off-policy Generative Adversarial Imitation Learning"

reinforcement-learning pytorch gan imitation-learning gail off-policy

Updated Apr 19, 2022
Python

amirhosein-mesbah / Reinforcement_learning

This repository contains implementations of a wide variety of Reinforcement Learning projects covering Bandit Algorithms, MDPs, Distributed RL, and Deep RL. They include university projects and projects pursued out of personal interest in Reinforcement Learning.

reinforcement-learning deep-reinforcement-learning q-learning gym mdp deeprl bandit-algorithms on-policy off-policy multi-agent-reinforcement-learning distributed-reinforcement-learning network-routing stablebaselines3

Updated Feb 18, 2023
Jupyter Notebook

DjAzDeck / SPG

Sample Policy Gradient

learning algorithm control optimization deep policy continuous action reinforcement deterministic actor-critic model-free off-policy

Updated Mar 8, 2026
Python

narjesno / Reinforcement-Learning

This repository contains all of the Reinforcement Learning-related projects I've worked on. The projects are part of the graduate course at the University of Tehran.

monte-carlo epsilon-greedy policy-gradient sarsa dynamic-programming policy-iteration model-based-rl n-armed-bandit-problem on-policy off-policy double-q-learning model-free-rl n-step-bootstrapping n-step-expected-sarsa n-step-tree-backup ucb-algorithm

Updated Oct 2, 2021
HTML

shaheennabi / Q-Learning-Off-policy

Q-learning is an off-policy temporal-difference control algorithm. It learns the optimal action-value function independently of the policy the agent follows while collecting experience.

off-policy temporal-difference model-free-rl td-control

Updated Dec 29, 2025
Python
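
The off-policy property described for Q-learning can be illustrated with a minimal tabular sketch. This is not code from the repository above; the five-state chain environment, the uniformly random behavior policy, and the names `step`, `q_learning`, and `greedy_policy` are all assumptions made for illustration. The agent acts at random, yet the update bootstraps from the greedy action in the next state, so the learned values track the optimal policy rather than the one being followed.

```python
import random

N_STATES = 5          # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right
GAMMA, ALPHA = 0.9, 0.5


def step(state, action):
    """Deterministic chain: reward 1.0 only on reaching the terminal state."""
    next_state = max(state - 1, 0) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def q_learning(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Behavior policy: uniformly random, never consults q.
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            # Off-policy target: bootstrap from the best action in next_state,
            # regardless of which action the behavior policy will take there.
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Greedy action for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q = q_learning()
print(greedy_policy(q))
```

In this sketch the greedy policy extracted from `q` moves right in every non-terminal state, even though the data came from a random walk. Replacing `max(q[next_state])` with the value of the action actually taken next would turn the update into SARSA, which is on-policy.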

SaminYeasar / off_policy_ac

Contains PyTorch implementations of the following off-policy actor-critic algorithms

reinforcement-learning pytorch ddpg sac actor-critic mujoco off-policy td3

Updated Aug 5, 2021
Python

ahmadsuleman / Precision-Autonomous-Parking-via-Reward-Augmented-Reinforcement-Learning

Autonomous Parking with Deep Reinforcement Learning: Custom MDP Development

simulation-environment unity3d deep-reinforcement-learning autonomous-driving markov-decision-processes on-policy off-policy ml-agents rl-environment autonomous-parking

Updated Feb 11, 2026

NUS-LID / RENAULT

Ensemble and Auxiliary Tasks for Data-Efficient Deep Reinforcement Learning

deep-learning deep-reinforcement-learning ensemble-learning deep-q-learning multi-task-learning deep-rl off-policy auxiliary-tasks model-free-rl data-efficient-learning

Updated Jul 2, 2021
Python

lionelblonde / giwr-pytorch-complete-history

PyTorch implementation of our work: "Where is the Grass Greener? Revisiting Generalized Policy Iteration for Offline Reinforcement Learning"

reinforcement-learning offline pytorch imitation-learning off-policy

Updated May 27, 2024
Python

lionelblonde / giwr-pytorch

PyTorch implementation of our work: "Optimality Inductive Biases and Agnostic Guidelines for Offline Reinforcement Learning"

reinforcement-learning offline pytorch imitation-learning off-policy

Updated May 27, 2024
Python

raja-grewal / rlmd

PROJECT MIGRATED TO CODEBERG - Reinforcement Learning in Multiplicative Domains

Updated Sep 26, 2023

jangwonkim-cocel / UD7

(Neurocomputing) Source code for the paper: Provable Generalization of Clipped Double Q-Learning for Variance Reduction and Sample Efficiency

reinforcement-learning pytorch rl off-policy uboc ud7

Updated Feb 24, 2026
Python

lionelblonde / sam-tf-complete-history

TensorFlow implementation of "Sample-efficient Imitation Learning via Generative Adversarial Nets"

tensorflow gan imitation-learning gail off-policy reinfrocement-learning

Updated Mar 8, 2019
Python

StavrosOrf / DistanceRL

Experimenting with State-Action Distance RL

reinforcement-learning reinforcement-learning-algorithms gymnasium mujoco off-policy

Updated May 4, 2026
Python

fardinabbasi / Tabulated_RL

Contains a custom-built Reinforcement Learning environment and implementations of key RL algorithms such as Q-learning and SARSA, tested in scenarios including a drone navigation challenge and the Frozen Lake environment.

q-learning mdp grid-world sarsa markov-decision-processes value-iteration tree-backup on-policy off-policy