Stars
RLinf: Reinforcement Learning Infrastructure for Embodied and Agentic AI
Elevate your AI research writing, no more tedious polishing ✨
This repository contains the code and data for the paper "Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards".
qqr is an RL training framework for open-ended agents.
We introduce BabyVision, a benchmark revealing the infancy of AI vision.
PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning
Resources and paper list for 'Scaling Environments for Agents'. This repository accompanies our survey on how environments contribute to agent intelligence.
Develop review and rebuttal agents for the OpenReview website
Public quant internship repository, maintained by NUFT but available for everyone.
[ICLR 2026] InftyThink: Breaking the Length Limits of Long-Context Reasoning in Large Language Models
Official code for our paper: "SketchThinker-R1: Towards Efficient Sketch-Style Reasoning in Large Multimodal Models".
(ICLR'26 + Netflix) Rank-GRPO: Training LLM-based Conversational Recommender Systems with Reinforcement Learning
[ICLR 2026] VitaBench: Benchmarking LLM Agents with Versatile Interactive Tasks in Real-world Applications
Open-source code for the ICLR 2026 paper: Evaluating Memory in LLM Agents via Incremental Multi-Turn Interactions
Pushing Test-Time Scaling Limits of Deep Search with Asymmetric Verification
The official repo of "WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents"
An Open-Source Large-Scale Reinforcement Learning Project for Search Agents
MiroFlow is an agent framework that enables tool-use agent tasks, featuring a reproducible GAIA score of 82.4%.
[IEEE Intelligent Systems] Awesome-Graph-augmented-LLM-Agent (GLA)
A MemAgent framework that can be extrapolated to 3.5M, along with a training framework for RL training of any agent workflow.
TreeRL: LLM Reinforcement Learning with On-Policy Tree Search in ACL'25
[NeurIPS'25] Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning
🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation
Get started with building Fullstack Agents using Gemini 2.5 and LangGraph