View WxxShirley's full-sized avatar
(Netflix 2025) Rank-GRPO: Training LLM-based Conversational Recommender Systems with Reinforcement Learning

Python 20 2 Updated Nov 3, 2025

VitaBench: Benchmarking LLM Agents with Versatile Interactive Tasks in Real-world Applications

Python 47 4 Updated Nov 4, 2025

Open source code for Paper: Evaluating Memory in LLM Agents via Incremental Multi-Turn Interactions

Python 117 18 Updated Oct 7, 2025

Pushing Test-Time Scaling Limits of Deep Search with Asymmetric Verification

Python 20 1 Updated Oct 8, 2025

Python 149 12 Updated Oct 27, 2025

The official repo of "WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents"

Python 84 1 Updated Sep 29, 2025

Python 1,332 120 Updated Sep 12, 2025

An Open-Source Large-Scale Reinforcement Learning Project for Search Agents

Python 486 29 Updated Oct 8, 2025

MiroMind Research Agent: Fully Open-Source Deep Research Agent with Reproducible State-of-the-Art Performance on FutureX, GAIA, HLE, BrowseComp and xBench.

Python 806 100 Updated Nov 5, 2025

[IEEE Intelligent Systems] Awesome-Graph-augmented-LLM-Agent (GLA)

20 1 Updated Nov 1, 2025

The official code of ARPO & AEPO

Python 758 35 Updated Nov 5, 2025

Democratizing Reinforcement Learning for LLMs

Jupyter Notebook 4,668 440 Updated Nov 4, 2025

A MemAgent framework that can extrapolate to 3.5M-token contexts, along with a framework for RL training of any agent workflow.

Python 769 55 Updated Jul 31, 2025

TreeRL: LLM Reinforcement Learning with On-Policy Tree Search in ACL'25

Python 78 6 Updated Jun 16, 2025

[NeurIPS'25] Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning

Python 74 6 Updated Sep 19, 2025

🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation

Python 18,278 2,120 Updated Sep 24, 2025

Get started with building Fullstack Agents using Gemini 2.5 and LangGraph

Jupyter Notebook 17,238 2,937 Updated Oct 21, 2025

verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for paper "Group-in-Group Policy Optimization for LLM Agent Training"

Python 1,135 96 Updated Oct 20, 2025

Python 9 Updated Sep 25, 2025

[arXiv 2025] Materials Generation in the Era of Artificial Intelligence: A Comprehensive Survey

34 1 Updated Jun 29, 2025

Code, Data and Model for Paper "Learning from Peers in Reasoning Models"

Python 26 2 Updated May 13, 2025

ZeroSearch: Incentivize the Search Capability of LLMs without Searching

Python 1,182 110 Updated Aug 16, 2025

[ICML 2025] Official implementation for paper "A Comprehensive Analysis on LLM-based Node Classification Algorithms"

Python 63 4 Updated Jul 1, 2025

A Sober Look at Language Model Reasoning

HTML 87 4 Updated Oct 9, 2025

✨ Official code for our paper: "Uncertainty-o: One Model-agnostic Framework for Unveiling Epistemic Uncertainty in Large Multimodal Models".

Python 17 3 Updated Mar 13, 2025

Open replication of DeepSeek R1 for text-to-graph extraction.

Python 99 15 Updated Jan 31, 2025

Python 31 2 Updated Aug 4, 2025

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.

Python 47,924 3,918 Updated Nov 5, 2025