Organizations

@PaddlePaddle

Cutting-edge platform for LLM agent tuning. Deliver RL tuning with flexibility, reliability, speed, multi-agent optimization and realtime community benchmarking.

Python 219 24 Updated Jun 4, 2026
Python 64 4 Updated Jun 20, 2026

An LLM post-training framework with vLLM for RL Scaling

Python 290 30 Updated Jun 22, 2026

Agentic RL on Any Harness at Scale

Python 580 61 Updated Jun 17, 2026

An Asynchronous Reinforcement Learning Engine for Omni-Modal Post-Training at Scale

Python 432 51 Updated Jun 18, 2026

The agent that grows with you

Python 199,299 35,398 Updated Jun 22, 2026

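Claude Code leaked source code: a locally runnable version, with a new cross-platform desktop app that fills in Computer Use (includes an analysis of the core modules)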

TypeScript 12,797 8,295 Updated Jun 20, 2026

OpenClaw-RL: Train any agent simply by talking

Python 5,515 598 Updated May 23, 2026

The open source coding agent.

TypeScript 177,111 21,622 Updated Jun 22, 2026

A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.

Python 4,437 707 Updated May 17, 2026

SkyRL: A Modular Full-stack RL Library for LLMs

Python 2,015 358 Updated Jun 22, 2026

[ICLR 2026] Tree Search for LLM Agent Reinforcement Learning

Python 376 37 Updated Jan 26, 2026

Code for the paper "BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping" by Zhiheng Xi et al.

Python 93 6 Updated Jan 29, 2026

Tongyi Deep Research, the Leading Open-source Deep Research Agent

Python 19,512 1,491 Updated Feb 27, 2026

[ICLR 2026] Information Gain-based Policy Optimization: A Simple and Effective Approach for Multi-Turn Search Agents

Python 107 5 Updated Apr 23, 2026

[ICLR 2026] Agentic Reinforced Policy Optimization (ARPO)

Python 1,056 60 Updated Apr 13, 2026

[ICLR 2026] QeRL enables RL for 32B LLMs on a single H100 GPU.

Python 507 52 Updated Mar 30, 2026

Bridge Megatron-Core to Hugging Face/Reinforcement Learning

Python 220 76 Updated Jun 15, 2026

Self-evolving memory OS for LLM & AI Agents: ultra-persistent memory, hybrid-retrieval, and cross-task skill reuse, with 35.24% token savings

TypeScript 9,948 906 Updated Jun 22, 2026

slime is an LLM post-training framework for RL Scaling.

Python 6,642 956 Updated Jun 21, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 83,521 18,302 Updated Jun 22, 2026
Python 62 5 Updated Jul 21, 2025

Scalable toolkit for efficient model reinforcement

Python 1,749 430 Updated Jun 22, 2026

Agentic RL Training at Scale

Python 1,500 315 Updated Jun 22, 2026

Muon is an optimizer for hidden layers in neural networks

Python 2,673 125 Updated May 24, 2026

Unleashing the Power of Reinforcement Learning for Math and Code Reasoners

Python 744 44 Updated Jun 6, 2025

Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling

Python 479 26 Updated May 17, 2025

Democratizing Reinforcement Learning for LLMs

Python 5,636 577 Updated Jun 22, 2026

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 5,625 867 Updated Jun 22, 2026

FireFlyer Record file format, writer and reader for DL training samples.

Python 249 24 Updated Dec 1, 2022