pandengyao/README.md

Hi there 👋

I am currently a Research & Development Engineer in the Reinforcement Learning Group of the AI Computing Department at Baidu.
I received my M.S. degree in Instrument Science and Technology from Beihang University (BUAA), and my B.Eng. in Measurement, Control Technology and Instrumentation from Nanjing University of Science and Technology (NJUST).

With five years of industry experience, I have worked across areas including software development, ROS-based system integration, model quantization and deployment, and MLSys optimization.
My current research focuses on Agentic Reinforcement Learning (Agentic RL): how autonomous agents can use reinforcement learning to improve large-scale intelligent systems.

Research Interests 🔭

My research primarily focuses on:

  • ML Systems: SGLang, veRL, AI infrastructure, and high-performance computing.
  • RL Systems for Agents: coding agents and pipelines, and RLHF for multi-agent systems.

Pinned

  1. volcengine/verl

    verl: Volcano Engine Reinforcement Learning for LLMs

    Python · 17.7k stars · 2.9k forks

  2. sgl-project/sglang

    SGLang is a fast serving framework for large language models and vision language models.

    Python · 21.8k stars · 3.8k forks

  3. zhaochenyang20/Awesome-ML-SYS-Tutorial

    My learning notes for ML SYS.

    Python · 4.7k stars · 299 forks

  4. THUDM/slime

    slime is an LLM post-training framework for RL Scaling.

    Python · 2.9k stars · 352 forks