SDXinMa
  • Jilin University
  • Jilin, China

Stars

RL

14 repositories

Clean, Robust, and Unified PyTorch implementation of popular Deep Reinforcement Learning (DRL) algorithms (Q-learning, Duel DDQN, PER, C51, Noisy DQN, PPO, DDPG, TD3, SAC, ASL)

Python 3,175 382 Updated Jun 11, 2025

Massively Parallel Deep Reinforcement Learning. 🔥

Python 4,256 965 Updated Dec 6, 2025

Concise PyTorch implementations of DRL algorithms, including REINFORCE, A2C, DQN, PPO (discrete and continuous), DDPG, TD3, and SAC.

Python 1,423 204 Updated Mar 29, 2023

An elegant PyTorch deep reinforcement learning library.

Python 9,002 1,200 Updated Dec 1, 2025

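PPO x Family DRL Tutorial Course (an introductory open course on decision intelligence: 8 lessons that help you sort out the algorithm theory, untangle the code logic, and put decision AI into practice)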

Python 2,459 204 Updated Mar 13, 2025

Fine-tune LLM agents with online reinforcement learning

Python 1,248 62 Updated Mar 19, 2024

Train transformer language models with reinforcement learning.

Python 16,720 2,370 Updated Dec 20, 2025

Deep RL algorithms in PyTorch

Jupyter Notebook 315 64 Updated Sep 5, 2023

This is a library that provides dual dexterous hand manipulation tasks through Isaac Gym

Python 938 115 Updated Feb 18, 2025

Monitoring recent cross-research on LLM & RL on arXiv for control. If there are good papers, PRs are welcome.

531 36 Updated Nov 17, 2025

High-quality single-file implementations of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

Python 8,572 929 Updated Jul 8, 2025

Simple RL training for reasoning

Python 3,811 281 Updated Aug 3, 2025

RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.

Jupyter Notebook 2,446 194 Updated Dec 3, 2025