SDXinMa

WangXian SDXinMa

14 followers · 79 following

Jilin University
Jilin, China

Stars

RL

14 repositories

XinJingHao / DRL-Pytorch

Clean, Robust, and Unified PyTorch implementation of popular Deep Reinforcement Learning (DRL) algorithms (Q-learning, Duel DDQN, PER, C51, Noisy DQN, PPO, DDPG, TD3, SAC, ASL)

Python 3,175 382 Updated Jun 11, 2025

AI4Finance-Foundation / ElegantRL

Massively Parallel Deep Reinforcement Learning. 🔥

Python 4,256 965 Updated Dec 6, 2025

Lizhi-sjtu / DRL-code-pytorch

Concise PyTorch implementations of DRL algorithms, including REINFORCE, A2C, DQN, PPO (discrete and continuous), DDPG, TD3, SAC.

Python 1,423 204 Updated Mar 29, 2023

thu-ml / tianshou

An elegant PyTorch deep reinforcement learning library.

Python 9,002 1,200 Updated Dec 1, 2025

opendilab / PPOxFamily

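PPO x Family DRL Tutorial Course (an introductory open course on decision intelligence: 8 lessons to help you sort out the algorithm theory, straighten out the code logic, and master practical decision AI applications)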

Python 2,459 204 Updated Mar 13, 2025

KhoomeiK / LlamaGym

Fine-tune LLM agents with online reinforcement learning

Python 1,248 62 Updated Mar 19, 2024

huggingface / trl

Train transformer language models with reinforcement learning.

Python 16,720 2,370 Updated Dec 20, 2025

AmazingAng / WTF-DeepRL

Deep RL algorithms in PyTorch

Jupyter Notebook 315 64 Updated Sep 5, 2023

PKU-MARL / DexterousHands

This is a library that provides dual dexterous hand manipulation tasks through Isaac Gym

Python 938 115 Updated Feb 18, 2025

WindyLab / LLM-RL-Papers

Monitoring recent cross-research on LLM & RL on arXiv for control. If there are good papers, PRs are welcome.

531 36 Updated Nov 17, 2025

vwxyzjn / cleanrl

High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

Python 8,572 929 Updated Jul 8, 2025

unitreerobotics / unitree_rl_gym

Python 2,688 436 Updated Jul 25, 2025

hkust-nlp / simpleRL-reason

Simple RL training for reasoning

Python 3,811 281 Updated Aug 3, 2025

mll-lab-nu / RAGEN

RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.

Jupyter Notebook 2,446 194 Updated Dec 3, 2025
