Stars
ReMe: Memory Management Kit for Agents - Remember Me, Refine Me.
AgentEvolver: Towards Efficient Self-Evolving Agent System
The official codebase for our paper, FLEX: Continuous Agent Evolution via Forward Learning from Experience.
🌎💪 BrowserGym, a Gym environment for web task automation
Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL"
OpenMMLab Semantic Segmentation Toolbox and Benchmark.
An RL Recipe for Building Agentic LLMs via Self-Imitation on Long-Horizon Agentic Tasks
MemGen: Weaving Generative Latent Memory for Self-Evolving Agents
Trinity-RFT is a general-purpose, flexible, and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (LLMs).
ToolQA, a new dataset to evaluate the capabilities of LLMs in answering challenging questions with external tools. It offers two levels (easy/hard) across eight real-life scenarios.
Code and implementations for the paper "AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning" by Zhiheng Xi et al.
This repository collects awesome surveys, resources, and papers for lifelong learning LLM agents
τ²-Bench: Evaluating Conversational Agents in a Dual-Control Environment
This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."
Self-Evolving Agent via Experience-Driven Lifelong Learning
An Analytical Evaluation Board of Multi-turn LLM Agents [NeurIPS 2024 Oral]
MiroMind Research Agent: Fully Open-Source Deep Research Agent with Reproducible State-of-the-Art Performance on FutureX, GAIA, HLE, BrowseComp and xBench.
🔥 Comprehensive survey on Context Engineering: from prompt engineering to production-grade AI systems. Hundreds of papers, frameworks, and implementation guides for LLMs and AI agents.
[EMNLP 2024] CompAct: Compressing Retrieved Documents Actively for Question Answering
The last data dump of Freebase with an introductory explanation of its schema
Build Real-Time Knowledge Graphs for AI Agents
Code repo for "LifelongAgentBench: Evaluating LLM Agents as Lifelong Learners"
A High-Efficiency System of Large Language Model Based Search Agents