- University of Science and Technology of China
- Hefei, Anhui
- https://ustc.edu.cn
- https://orcid.org/0009-0003-2518-554X
- https://changshuoshen.github.io/
- https://changshuoshen.github.io/blog/
Stars
SGLang is a high-performance serving framework for large language models and multimodal models.
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.
[ICML 2025] An official source code for paper "FlipAttack: Jailbreak LLMs via Flipping".
Improved techniques for optimization-based jailbreaking on large language models (ICLR2025)
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
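An open-source GPU server management platform; view model training status, GPU resource usage, model training logs, IP access records, and more in real time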
Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment"
Collecting publicly available distillation datasets based on DeepSeek-R1
Fast and memory-efficient exact attention
[NeurIPS 2025] VeriThinker: Learning to Verify Makes Reasoning Model Efficient
ZJU-REAL / EasySteer-vllm
Forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
A Unified Framework for High-Performance and Extensible LLM Steering
[COLM 2025] SEAL: Steerable Reasoning Calibration of Large Language Models for Free
Code for "Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate" [COLM 2025]
Code for paper "The Markovian Thinker: Architecture-Agnostic Linear Scaling of Reasoning"
AlphaLab-USTC / A2A
Forked from a2aproject/A2A
An open protocol enabling communication and interoperability between opaque agentic applications.
[ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning".
[ICLR 2026] On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification.
[NeurIPS 2025] Search and Refine During Think: Facilitating Knowledge Refinement for Improved Retrieval-Augmented Reasoning
Automated tool for running Python programs in a streamlined manner
SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis
[ICLR 2025 Oral 🏆] The implementation of paper "Language Representations Can be What Recommenders Need: Findings and Potentials"
[SIGIR 2024 perspective] The implementation of paper "On Generative Agents in Recommendation"
verl: Volcano Engine Reinforcement Learning for LLMs