View zhanjiqing's full-sized avatar
🎯
Focusing


My learning notes for ML SYS.

Python 4,699 298 Updated Dec 19, 2025

A version of verl to support diverse tool use

Python 762 63 Updated Dec 10, 2025

Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning

Python 1,047 74 Updated Nov 25, 2025

Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.

Python 7,258 698 Updated Nov 19, 2025
Python 439 46 Updated Nov 25, 2025
Python 1,035 63 Updated Nov 20, 2025

Building Open-Ended Embodied Agents with Internet-Scale Knowledge

Java 2,095 185 Updated Mar 18, 2024

An Open-Ended Embodied Agent with Large Language Models

JavaScript 6,533 622 Updated Apr 3, 2024
Python 249 19 Updated Aug 12, 2025

R1-searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning

Python 668 47 Updated Aug 5, 2025

Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL

Python 3,673 309 Updated Nov 13, 2025

ReCall: Learning to Reason with Tool Call for LLMs via Reinforcement Learning

Python 1,266 76 Updated May 16, 2025

Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.

Python 3,233 252 Updated Dec 19, 2025

Train your Agent model via our easy and efficient framework

Python 1,664 156 Updated Dec 5, 2025

Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!

Python 8,048 643 Updated Dec 19, 2025

The absolute trainer to light up AI agents.

Python 9,740 790 Updated Dec 19, 2025
Python 1,327 95 Updated Dec 18, 2025
Python 385 18 Updated Dec 16, 2025

TROLL: Trust Region Optimization for Large Language models

Python 7 Updated Nov 27, 2025

CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning

Python 277 60 Updated Nov 3, 2025
Python 610 54 Updated Dec 16, 2025

(best/better) practices of megatron on veRL and tuning guide

Shell 111 8 Updated Sep 26, 2025
Python 17 13 Updated Sep 22, 2025

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 4,285 327 Updated Dec 15, 2025
Python 19 1 Updated Dec 17, 2025

Paper reading and discussion notes, covering AI frameworks, distributed systems, cluster management, etc.

48 1 Updated Nov 11, 2025

A set of examples based on verl for end-to-end RL training recipes.

Python 68 7 Updated Dec 1, 2025

[ASPLOS'26] Taming the Long-Tail: Efficient Reasoning RL Training with Adaptive Drafter

Python 113 8 Updated Dec 5, 2025

A construction kit for reinforcement learning environment management.

Python 250 26 Updated Dec 19, 2025

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

C++ 5,578 655 Updated Dec 19, 2025