Dyx77777

🎯

Focusing

Yixiao Duan Dyx77777

1 follower · 2 following

https://www.kaggle.com/dyx937657

Stars

bytedance / HLLM

HLLM: Enhancing Sequential Recommendations via Hierarchical Large Language Models for Item and User Modeling

Python 623 82 Updated Aug 26, 2025

AkaliKong / MiniOneRec

Minimal reproduction of OneRec

Python 1,640 233 Updated May 14, 2026

meta-recsys / generative-recommenders

Repository hosting code for "Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations" (https://arxiv.org/abs/2402.17152).

Python 1,921 391 Updated Jun 12, 2026

hiyouga / LlamaFactory

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 72,156 8,827 Updated Jun 13, 2026

areal-project / AReaL

The RL Bridge for LLM-based Agent Applications. Made Simple & Flexible.

Python 5,305 520 Updated Jun 14, 2026

THUDM / slime

slime is an LLM post-training framework for RL Scaling.

Python 6,118 895 Updated Jun 14, 2026

verl-project / verl

verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework

Python 21,961 4,074 Updated Jun 14, 2026

deepseek-ai / DeepSeek-R1

92,008 11,713 Updated Jun 27, 2025

Jiayi-Pan / TinyZero

Minimal reproduction of DeepSeek R1-Zero

Python 13,165 1,585 Updated Feb 27, 2026

PeterGriffinJin / Search-R1

Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL

Python 4,935 442 Updated Nov 13, 2025

GAIR-NLP / DeepResearcher

Scaling Deep Research via Reinforcement Learning in Real-world Environments.

Python 769 53 Updated May 10, 2026

EvolvingLMMs-Lab / multimodal-search-r1

[ACL-2026] MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search tools.

Python 458 25 Updated Apr 7, 2026

Agent-RL / ReCall

ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning & ReCall: Learning to Reason with Tool Call for LLMs via Reinforcement Learning

Python 1,390 86 Updated May 16, 2025

hzy312 / knowledge-r1

IKEA: Reinforced Internal-External Knowledge Synergistic Reasoning for Efficient Adaptive Search Agent

Python 70 4 Updated May 13, 2025

syr-cn / AutoRefine

[NeurIPS 2025] Search and Refine During Think: Facilitating Knowledge Refinement for Improved Retrieval-Augmented Reasoning

Python 138 7 Updated Dec 13, 2025

KnowledgeXLab / O2-Searcher

A Searching-based Agent Model for Open-Domain Open-Ended Question Answering

Python 39 4 Updated Jun 20, 2025

Alibaba-NLP / ZeroSearch

ZeroSearch: Incentivize the Search Capability of LLMs without Searching

Python 1,295 120 Updated Aug 16, 2025

Alibaba-NLP / MaskSearch

Repo for "MaskSearch: A Universal Pre-Training Framework to Enhance Agentic Search Capability"

Python 155 8 Updated May 27, 2025

Alibaba-NLP / VRAG

Multimodal Retrieval-augmented Generation Framework Built by Tongyi Lab, Alibaba Group.

Python 948 91 Updated Apr 29, 2026

yongchao98 / R1-Code-Interpreter

R1-Code-Interpreter: Training LLMs to Reason with Code via Supervised and Reinforcement Learning

Python 43 5 Updated Feb 9, 2026

QingFei1 / R-Search

[ACL 2026] R-Search: Empowering LLM Reasoning with Search via Multi-Reward Reinforcement Learning

Python 34 1 Updated Jan 4, 2026

Zillwang / StepSearch

[EMNLP 2025 Main] StepSearch: Igniting LLMs Search Ability via Step-Wise Proximal Policy Optimization

Python 72 11 Updated Sep 13, 2025

ltzheng / SimpleTIR

[ICLR 2026] End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Python 389 24 Updated Mar 30, 2026

ulab-uiuc / Router-R1

[NeurIPS'25] Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning

Python 136 15 Updated Dec 30, 2025

inclusionAI / ASearcher

An Open-Source Large-Scale Reinforcement Learning Project for Search Agents

Python 593 38 Updated Nov 26, 2025

weiyifan1023 / AutoTIR

Code and Data for Paper "AutoTIR: Autonomous Tools Integrated Reasoning via Reinforcement Learning"

Python 53 6 Updated Sep 4, 2025

TIGER-AI-Lab / verl-tool

A version of verl to support diverse tool use [TMLR 2026]

Python 998 83 Updated Jun 8, 2026

AMAP-ML / Tree-GRPO

[ICLR 2026] Tree Search for LLM Agent Reinforcement Learning

Python 370 35 Updated Jan 26, 2026

Da1yuqin / EviNoteRAG

Welcome! 😊 This is the official code release of EviNote-RAG, and we’re happy to share it with the community.

Python 48 5 Updated Jun 4, 2026

CarnegieBin / GlobalRAG

This is the official repository for the paper: GlobalRAG: Enhancing Global Reasoning in Multi-hop Question Answering via Reinforcement Learning

Python 14 2 Updated May 3, 2026