Stars
HLLM: Enhancing Sequential Recommendations via Hierarchical Large Language Models for Item and User Modeling
Repository hosting code for "Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations" (https://arxiv.org/abs/2402.17152).
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
The RL Bridge for LLM-based Agent Applications. Made Simple & Flexible.
slime is an LLM post-training framework for RL Scaling.
verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework
Minimal reproduction of DeepSeek R1-Zero
Search-R1: An Efficient, Scalable RL Training Framework for LLMs that Interleave Reasoning and Search Engine Calling, based on veRL
Scaling Deep Research via Reinforcement Learning in Real-world Environments.
[ACL 2026] MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search tools.
ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning & ReCall: Learning to Reason with Tool Call for LLMs via Reinforcement Learning
IKEA: Reinforced Internal-External Knowledge Synergistic Reasoning for Efficient Adaptive Search Agent
[NeurIPS 2025] Search and Refine During Think: Facilitating Knowledge Refinement for Improved Retrieval-Augmented Reasoning
A Searching-based Agent Model for Open-Domain Open-Ended Question Answering
ZeroSearch: Incentivize the Search Capability of LLMs without Searching
Repo for "MaskSearch: A Universal Pre-Training Framework to Enhance Agentic Search Capability"
Multimodal Retrieval-augmented Generation Framework Built by Tongyi Lab, Alibaba Group.
R1-Code-Interpreter: Training LLMs to Reason with Code via Supervised and Reinforcement Learning
[ACL 2026] R-Search: Empowering LLM Reasoning with Search via Multi-Reward Reinforcement Learning
[EMNLP 2025 Main] StepSearch: Igniting LLMs Search Ability via Step-Wise Proximal Policy Optimization
[ICLR 2026] End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning
[NeurIPS'25] Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning
An Open-Source Large-Scale Reinforcement Learning Project for Search Agents
Code and Data for Paper "AutoTIR: Autonomous Tools Integrated Reasoning via Reinforcement Learning"
A version of verl to support diverse tool use [TMLR 2026]
[ICLR 2026] Tree Search for LLM Agent Reinforcement Learning
Welcome! 😊 This is the official code release of EviNote-RAG, and we’re happy to share it with the community.
This is the official repository for the paper "GlobalRAG: Enhancing Global Reasoning in Multi-hop Question Answering via Reinforcement Learning"