agentic-rl

Here are 31 public repositories matching this topic...

walkinglabs / hands-on-modern-rl

🚀 An open-source, hands-on curriculum bridging the gap from basic RL concepts to LLM alignment, RLVR, and advanced Agentic systems.

agent tutorial pytorch dpo reinforcemen llm rlhf agentic agentic-ai grpo llm-alignment agentic-rl

Updated Jul 3, 2026
Python

AgentR1 / Agent-R1

Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning

agent llm agentic-rl

Updated Jul 21, 2026
Python

redai-infra / Relax

An Asynchronous Reinforcement Learning Engine for Omni-Modal Post-Training at Scale

reinforcement-learning multi-agent vlm distributed-training post-training multimodal megatron-lm llm ray-serve rlhf qwen sglang grpo agentic-rl

Updated Jul 22, 2026
Python

rlops / rlix

Run more RL experiments. Wait less for GPUs.

reinforcement-learning rl lora tinker mlops ml-systems gpu-scheduling llm-training agentic-rl

Updated Jul 19, 2026
Python

InternLM / ARM-Thinker

[CVPR 2026] Official Code for "ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning"

vlm llm vision-language-model reward-modeling agentic-rl think-with-image

Updated Feb 13, 2026
Python

AgentR1 / Claw-R1

Claw-R1: Empowering OpenClaw with Advanced Agentic RL.

agent agentic-rl openclaw

Updated Jun 9, 2026
Python

AMAP-ML / Thinking-with-Map

[ACL 2026 Findings] Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization

agent reasoning geo-localization mllm agentic-rl

Updated Mar 9, 2026
Python

XiaoRed5 / Agentic-RL-Most-Detailed-Intro

The most detailed introduction to Agentic RL

tutorial reinforcement-learning credit-assignment llm-agents agentic-rl

Updated Jul 23, 2026
HTML

0bserver07 / Study-Reinforcement-Learning

RL study guide — foundations through RLHF, DPO, GRPO, RLVR, agentic RL, and offline RL. Hand-written CS294 notes, 19 lecture drafts, 5 tested exercises, citations that resolve.

machine-learning reinforcement-learning deep-learning q-learning policy-gradient study-notes lecture-notes ppo dpo rlhf constitutional-ai deepseek-r1 grpo llm-alignment rlvr sutton-barto agentic-rl

Updated Jul 1, 2026
Python

xxzcc / Awesome-Credit-Assignment-in-LLM-RL

Curated papers, taxonomy, benchmarks, and decision guides for credit assignment in reasoning and agentic LLM reinforcement learning.

awesome reinforcement-learning awesome-list multi-agent-systems credit-assignment large-language-models llm process-reward-model rlvr agentic-rl

Updated Jul 22, 2026
Python

Computer-use-agents / dart-gui

DART-GUI: Efficient Multi-turn RL for GUI Agents via Decoupled Training and Adaptive Data Curation

gui-agent computer-use-agent agentic-rl

Updated Feb 26, 2026
Python

strands-rl / strands-sglang

SGLang model provider of Strands Agents for on-policy agentic RL training.

ai-agents sglang strands-agents agentic-rl

Updated Jul 18, 2026
Python

hscspring / rl-llm-nlp

Curated, opinionated index of post-R1 LLM × Reinforcement Learning. Deep-dive blog posts cross-linked to papers: GRPO, DAPO, DPO, PPO, RLHF, GSPO, CISPO, VAPO, Reward Modeling, MoE RL stability, Verifier-Free RL, Training-Free RL, Agentic RL, DeepSeek-R1 reproduction.

Updated Jul 13, 2026

horizon-llm / AlphaQuanter

[ACL 2026] AlphaQuanter: An End-to-End Tool-Orchestrated Agentic Reinforcement Learning Framework for Stock Trading.

agent agentic-rl

Updated Jul 3, 2026
Python

FlyTune / ProxMO-RL

Proximity-based Multi-turn Optimization (ProxMO) - Official Implementation

efficiency rl llm agentic-rl

Updated Mar 29, 2026
Python

X-PLUG / ToolCUA

ToolCUA: Towards Optimal GUI-Tool Path Orchestration for Computer Use Agents

sandbox-environment mllm gui-agent computer-use-agent agentic-rl

Updated May 13, 2026
Python

strands-rl / strands-env

A framework for building agent environments for RL training and evaluation with Strands Agents.

ai-agents strands-agents agentic-rl agent-environments

Updated Jul 22, 2026
Python

EvolvingLMMs-Lab / ParaVT

ParaVT: Taming the Tool Prior Paradox for Parallel Tool Use in Agentic Video Reinforcement Learning

reinforcement-learning tool-use long-video-understanding video-llm grpo agentic-rl multimodal-rl

Updated Jun 2, 2026
Python

thu-unicorn / Doctor-R1

Official repository for the paper "Doctor-R1: Mastering Clinical Inquiry with Experiential Agentic Reinforcement Learning", published at ICLR 2026.

experience medical-ai agentic-rl

Updated Apr 11, 2026
Python

WxxShirley / Agent-STAR

Official implementation for paper "Demystifying Reinforcement Learning for Long-Horizon Tool-Using Agents: A Comprehensive Recipe"

agent reinforcement-learning reinforcement-learning-agent agentic-rl

Updated May 12, 2026
Python
