- Zhejiang University
- Hangzhou, China
- https://oe-heart.github.io/
Stars
slime is an LLM post-training framework for RL Scaling.
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
Use Claude Code as the foundation for coding infrastructure, allowing you to decide how to interact with the model while enjoying updates from Anthropic.
Official repo of Toucan: Synthesizing 1.5M Tool-Agentic Data from Real-World MCP Environments
DeepResearchAgent is a hierarchical multi-agent system designed not only for deep research tasks but also for general-purpose task solving. The framework leverages a top-level planning agent to coo…
An open platform for enhancing the capability of LLMs in workflow orchestration.
bespokelabsai / verifiers
Forked from PrimeIntellect-ai/verifiers
Verifiers for LLM Reinforcement Learning
Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)
Kimi K2 is the large language model series developed by Moonshot AI team
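《代码随想录》 LeetCode problem-solving guide: a recommended order for 200 classic problems, 600,000 characters of detailed illustrated explanations, video walkthroughs of the difficult points, and more than 50 mind maps, with versions in C++, Java, Python, Go, JavaScript, and other languages. No more getting lost while learning algorithms! 🔥🔥 Take a look and you will wish you had found it sooner! 🚀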
Minimal reproduction of DeepSeek R1-Zero
CycleResearcher: Improving Automated Research via Automated Review
Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!
xLAM: A Family of Large Action Models to Empower AI Agent Systems
verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for the paper "Group-in-Group Policy Optimization for LLM Agent Training".
Pocket Flow: 100-line LLM framework. Let Agents build Agents!
AutoMind: Adaptive Knowledgeable Agent for Automated Data Science
Agent Laboratory is an end-to-end autonomous research workflow meant to assist you as the human researcher toward implementing your research ideas
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.
Fully open reproduction of DeepSeek-R1