Beijing Jiaotong University
- Beijing
https://songmzhang.github.io/
Stars
A curated collection of papers, technical reports, frameworks, and tools for on-policy distillation (OPD) of large language models
SGLang is a high-performance serving framework for large language models and multimodal models.
A user-friendly & efficient knowledge distillation framework for LLMs, supporting off-policy, on-policy (OPD), cross-tokenizer, multimodal, and on-policy self-distillation.
Code for EMNLP2023 paper "A Quality-based Syntactic Template Retriever for Syntactically-controlled Paraphrase Generation".
Code for EMNLP-2025 (Findings) paper “CM-Align: Consistency-based Multilingual Alignment for Large Language Models”.
Code for "Think Natively: Unlocking Multilingual Reasoning with Consistency-Enhanced Reinforcement Learning".
Efficient Triton Kernels for LLM Training
slime is an LLM post-training framework for RL Scaling.
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-V4, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Gemma4, Llava, …
Ongoing research training transformer models at scale
Code for ACL 2025 Paper "AlignDistil: Token-Level Language Model Alignment as Adaptive Policy Distillation"
verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework
Retrieval and Retrieval-augmented LLMs
The official implementation of the paper "A Dual-Space Framework for General Knowledge Distillation of Large Language Models".
Arena-Hard-Auto: An automatic LLM benchmark.
An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Async RL)
Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally.
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
A framework for few-shot evaluation of language models.
Repo for the EMNLP'24 Paper "Dual-Space Knowledge Distillation for Large Language Models". A general white-box KD framework for both same-tokenizer and cross-tokenizer LLM distillation.
This repository maintains a collection of important papers on knowledge distillation (awesome-knowledge-distillation).
Awesome LLM compression research papers and tools.
This repository collects papers for "A Survey on Knowledge Distillation of Large Language Models". We break down KD into Knowledge Elicitation and Distillation Algorithms, and explore the Skill & V…
Official PyTorch implementation of DistiLLM: Towards Streamlined Distillation for Large Language Models (ICML 2024)
Fay is an agent framework that helps digital humans (2.5D, 3D, mobile, PC, web) or large language models (OpenAI-compatible, DeepSeek) connect to business systems.