
Official implementation for paper "Demystifying Reinforcement Learning for Long-Horizon Tool-Using Agents: A Comprehensive Recipe"

Python 11 Updated Mar 24, 2026

Mobile-Agent: The Powerful GUI Agent Family

Python 8,314 838 Updated Mar 26, 2026

UniScientist is designed to advance universal scientific research intelligence through a unified paradigm

Python 151 10 Updated Mar 14, 2026

RLAnything & DemyAgent: General and scalable agentic RL algorithms across terminal, GUI, SWE, and tool-call settings

Python 412 49 Updated Feb 27, 2026

Dr. MAS is an end-to-end RL training framework for multi-agent LLM systems, supporting the co-training of multiple (heterogeneous) LLMs.

Python 124 8 Updated Feb 11, 2026

RLinf: Reinforcement Learning Infrastructure for Embodied and Agentic AI

Python 2,885 363 Updated Mar 26, 2026

Elevate your AI research writing, no more tedious polishing ✨

14,082 1,091 Updated Mar 25, 2026

This repository contains the code and data for the paper "Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards".

Python 59 6 Updated Mar 14, 2026

DeepResearch Bench II (DRB2) is the follow-up to DeepResearch Bench, with a stronger focus on measuring the gap between deep research systems and human experts. It does so by decomposing expert-wri…

Python 37 2 Updated Feb 24, 2026

qqr is an RL training framework for open-ended agents.

Python 227 20 Updated Mar 25, 2026

We introduce BabyVision, a benchmark revealing the infancy of AI vision.

Python 203 7 Updated Jan 13, 2026

PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning

Python 334 14 Updated Feb 5, 2026

Resources and paper list for 'Scaling Environments for Agents'. This repository accompanies our survey on how environments contribute to agent intelligence.

64 3 Updated Jan 28, 2026

Develop review and rebuttal agents for the OpenReview website

Python 6 Updated Dec 15, 2025

Public quant internship repository, maintained by NUFT but available for everyone.

OCaml 2,097 141 Updated Oct 19, 2025

[ICLR 2026] InftyThink: Breaking the Length Limits of Long-Context Reasoning in Large Language Models

Python 50 1 Updated Feb 12, 2026

[ICLR'26] SketchThinker-R1: Towards Efficient Sketch-Style Reasoning in Large Multimodal Models

Python 17 2 Updated Mar 26, 2026

(ICLR'26 + Netflix) Rank-GRPO: Training LLM-based Conversational Recommender Systems with Reinforcement Learning

Python 39 5 Updated Nov 17, 2025

[ICLR 2026] VitaBench: Benchmarking LLM Agents with Versatile Interactive Tasks in Real-world Applications

Python 104 10 Updated Feb 22, 2026

Open source code for ICLR 2026 Paper: Evaluating Memory in LLM Agents via Incremental Multi-Turn Interactions

Python 273 42 Updated Jan 27, 2026

Pushing Test-Time Scaling Limits of Deep Search with Asymmetric Verification

Python 22 1 Updated Oct 8, 2025

Python 290 19 Updated Jan 3, 2026

The official repo of "WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents"

Python 111 2 Updated Sep 29, 2025

Python 1,403 127 Updated Sep 12, 2025

An Open-Source Large-Scale Reinforcement Learning Project for Search Agents

Python 569 37 Updated Nov 26, 2025

🏆 Top-1 on 5+ benchmarks | Web UI | Supports MiroThinker, Claude, Kimi, OpenAI

Python 2,853 300 Updated Mar 23, 2026

[IEEE Intelligent Systems] Awesome-Graph-augmented-LLM-Agent (GLA)

68 3 Updated Nov 17, 2025

[ICLR 2026] Agentic Reinforced Policy Optimization (ARPO)

Python 920 49 Updated Jan 28, 2026

Democratizing Reinforcement Learning for LLMs

Python 5,290 525 Updated Mar 26, 2026

A MemAgent framework that can be extrapolated to 3.5M, along with a training framework for RL training of any agent workflow.

Python 958 67 Updated Jul 31, 2025