
Organizations

@QSCTech @zjunlp


slime is an LLM post-training framework for RL Scaling.

Python 3,650 487 Updated Feb 3, 2026

An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models

Python 2,763 207 Updated Feb 4, 2026

Use Claude Code as the foundation for coding infrastructure, allowing you to decide how to interact with the model while enjoying updates from Anthropic.

TypeScript 27,202 2,116 Updated Jan 10, 2026

Official repo of Toucan: Synthesizing 1.5M Tool-Agentic Data from Real-World MCP Environments

Python 222 10 Updated Dec 16, 2025
Python 1,387 124 Updated Sep 12, 2025

青稞Talk

190 1 Updated Jan 21, 2026

DeepResearchAgent is a hierarchical multi-agent system designed not only for deep research tasks but also for general-purpose task solving. The framework leverages a top-level planning agent to coo…

JavaScript 3,098 408 Updated Sep 29, 2025

An open platform for enhancing the capability of LLMs in workflow orchestration.

Python 183 24 Updated Mar 11, 2025

Verifiers for LLM Reinforcement Learning

Python 80 11 Updated Apr 15, 2025

The official code of ARPO & AEPO

Python 880 41 Updated Jan 28, 2026

Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)

Python 12,711 1,313 Updated Jan 17, 2026

Democratizing Reinforcement Learning for LLMs

Python 5,071 496 Updated Feb 4, 2026

Scaling RL on advanced reasoning models

Python 661 41 Updated Oct 20, 2025

Kimi K2 is the large language model series developed by Moonshot AI team

10,257 773 Updated Jan 21, 2026
Python 219 11 Updated Jun 2, 2025

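《代码随想录》 LeetCode practice guide: a recommended order for working through 200 classic problems, with about 600,000 characters of detailed illustrated explanations, video walkthroughs of the difficult points, more than 50 mind maps, and versions in C++, Java, Python, Go, JavaScript, and other languages, so that learning algorithms is no longer confusing! 🔥🔥 Take a look and you will wish you had found it sooner! 🚀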

Shell 60,235 12,332 Updated Jan 27, 2026

Minimal reproduction of DeepSeek R1-Zero

Python 12,672 1,549 Updated Apr 24, 2025

CycleResearcher: Improving Automated Research via Automated Review

Jupyter Notebook 330 32 Updated Jul 10, 2025

Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!

Python 8,542 708 Updated Feb 4, 2026

xLAM: A Family of Large Action Models to Empower AI Agent Systems

Python 600 50 Updated Aug 21, 2025

verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for paper "Group-in-Group Policy Optimization for LLM Agent Training"

Python 1,492 130 Updated Jan 30, 2026

Pocket Flow: 100-line LLM framework. Let Agents build Agents!

Python 9,736 1,077 Updated Dec 24, 2025

The official repository of ALE-Bench

Python 155 21 Updated Feb 3, 2026

AutoMind: Adaptive Knowledgeable Agent for Automated Data Science

Python 84 6 Updated Oct 9, 2025

Agent Laboratory is an end-to-end autonomous research workflow meant to assist you as the human researcher toward implementing your research ideas

Python 5,263 755 Updated Aug 20, 2025

Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.

Python 3,489 279 Updated Feb 4, 2026

Fully open reproduction of DeepSeek-R1

Python 25,853 2,411 Updated Nov 24, 2025

Linux namespaces and seccomp-bpf sandbox

C 6,996 645 Updated Feb 3, 2026