Stars
(Netflix 2025) Rank-GRPO: Training LLM-based Conversational Recommender Systems with Reinforcement Learning
VitaBench: Benchmarking LLM Agents with Versatile Interactive Tasks in Real-world Applications
Open-source code for the paper "Evaluating Memory in LLM Agents via Incremental Multi-Turn Interactions"
Pushing Test-Time Scaling Limits of Deep Search with Asymmetric Verification
The official repo of "WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents"
An Open-Source Large-Scale Reinforcement Learning Project for Search Agents
MiroMind Research Agent: Fully Open-Source Deep Research Agent with Reproducible State-of-the-Art Performance on FutureX, GAIA, HLE, BrowseComp and xBench.
[IEEE Intelligent Systems] Awesome-Graph-augmented-LLM-Agent (GLA)
Democratizing Reinforcement Learning for LLMs
A MemAgent framework that extrapolates to 3.5M-token contexts, along with a framework for RL training of any agent workflow.
TreeRL: LLM Reinforcement Learning with On-Policy Tree Search in ACL'25
[NeurIPS'25] Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning
🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation
Get started with building Fullstack Agents using Gemini 2.5 and LangGraph
verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for paper "Group-in-Group Policy Optimization for LLM Agent Training"
[arXiv 2025] Materials Generation in the Era of Artificial Intelligence: A Comprehensive Survey
Code, data, and model for the paper "Learning from Peers in Reasoning Models"
ZeroSearch: Incentivize the Search Capability of LLMs without Searching
[ICML 2025] Official implementation for paper "A Comprehensive Analysis on LLM-based Node Classification Algorithms"
✨ Official code for our paper: "Uncertainty-o: One Model-agnostic Framework for Unveiling Epistemic Uncertainty in Large Multimodal Models".
Open replication of DeepSeek R1 for text-to-graph extraction.
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.