Stars
Official implementation for the paper "Demystifying Reinforcement Learning for Long-Horizon Tool-Using Agents: A Comprehensive Recipe"
Mobile-Agent: The Powerful GUI Agent Family
UniScientist is designed to advance universal scientific research intelligence through a unified paradigm
RLAnything & DemyAgent: General and scalable agentic RL algorithms across terminal, GUI, SWE, and tool-call settings
Dr. MAS is an end-to-end RL training framework for multi-agent LLM systems, supporting the co-training of multiple (heterogeneous) LLMs.
RLinf: Reinforcement Learning Infrastructure for Embodied and Agentic AI
Elevate your AI research writing, no more tedious polishing ✨
This repository contains the code and data for the paper "Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards".
DeepResearch Bench II (DRB2) is the follow-up to DeepResearch Bench, with a stronger focus on measuring the gap between deep research systems and human experts. It does so by decomposing expert-wri…
qqr is an RL training framework for open-ended agents.
We introduce BabyVision, a benchmark revealing the infancy of AI vision.
PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning
Resources and paper list for 'Scaling Environments for Agents'. This repository accompanies our survey on how environments contribute to agent intelligence.
Develop review and rebuttal agents for the OpenReview website
Public quant internship repository, maintained by NUFT but available for everyone.
[ICLR 2026] InftyThink: Breaking the Length Limits of Long-Context Reasoning in Large Language Models
[ICLR'26] SketchThinker-R1: Towards Efficient Sketch-Style Reasoning in Large Multimodal Models
(ICLR'26 + Netflix) Rank-GRPO: Training LLM-based Conversational Recommender Systems with Reinforcement Learning
[ICLR 2026] VitaBench: Benchmarking LLM Agents with Versatile Interactive Tasks in Real-world Applications
Open-source code for the ICLR 2026 paper: Evaluating Memory in LLM Agents via Incremental Multi-Turn Interactions
Pushing Test-Time Scaling Limits of Deep Search with Asymmetric Verification
The official repo of "WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents"
An Open-Source Large-Scale Reinforcement Learning Project for Search Agents
🏆 Top-1 on 5+ benchmarks | Web UI | Supports MiroThinker, Claude, Kimi, OpenAI
[IEEE Intelligent Systems] Awesome-Graph-augmented-LLM-Agent (GLA)
[ICLR 2026] Agentic Reinforced Policy Optimization (ARPO)
A MemAgent framework that can be extrapolated to 3.5M, along with a training framework for RL training of any agent workflow.