An elegant PyTorch deep reinforcement learning library.
Updated Apr 3, 2026 - Python
PyTorch implementation of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3, and ...
PyTorch implementation of Deep Reinforcement Learning: Policy Gradient methods (TRPO, PPO, A2C) and Generative Adversarial Imitation Learning (GAIL). Fast Fisher vector product TRPO.
Python library for Reinforcement Learning.
Master Reinforcement and Deep Reinforcement Learning using OpenAI Gym and TensorFlow
Deep Reinforcement Learning with PyTorch & visdom
This repository contains PyTorch implementations of most classic deep reinforcement learning algorithms, including DQN, DDQN, Dueling Network, DDPG, SAC, A2C, PPO, and TRPO. (More algorithms are still in progress.)
🐋 Simple implementations of various popular Deep Reinforcement Learning algorithms using TensorFlow 2
Master classic RL, deep RL, distributional RL, inverse RL, and more using OpenAI Gym and TensorFlow, with extensive math
PyTorch implementation of Trust Region Policy Optimization
🔥🌟《Machine Learning 格物志》: ML + DL + RL basic code and notes using sklearn, PyTorch, TensorFlow, Keras and, most importantly, from scratch!💪 This repository is ALL You Need!
Deep RL algorithm implementations that are easy to read and understand, in PyTorch and TensorFlow 2 (DQN, REINFORCE, VPG, A2C, TRPO, PPO, DDPG, TD3, SAC)
🚀 A fast safe reinforcement learning library in PyTorch
PyTorch implementation of reinforcement learning algorithms (Soft Actor-Critic (SAC) / DDPG / TD3 / DQN / A2C / PPO / TRPO)
ROS 2 enabled Machine Learning algorithms
TensorFlow implementation of generative adversarial imitation learning
Implementations of deep RL papers and random experimentation
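Basic reinforcement learning algorithms, including DQN, Double DQN, Dueling DQN, SARSA, REINFORCE, baseline-REINFORCE, Actor-Critic, DDPG, DDPG for discrete action spaces, A2C, A3C, TD3, SAC, and TRPO
A medical question-answering system based on Qwen2 + SFT + DPO. The project uses custom SFTTrainer/DPOTrainer/TRPOTrainer classes for training and calls various knowledge-base tools (neo4j, milvus, LDA, etc.) to generate training data automatically. vllm is used for inference and for deploying the trained model, which connects through the vllm API to a RAG system built on an embedder + Reranker. The project also implements a multi-agent consultation system based on the MDAgents paper, which likewise supports access through the vllm API.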