View songmzhang's full-sized avatar


A curated collection of papers, technical reports, frameworks, and tools for on-policy distillation (OPD) of large language models

353 7 Updated Jun 16, 2026

SGLang is a high-performance serving framework for large language models and multimodal models.

Python 29,082 6,557 Updated Jun 16, 2026

A user-friendly & efficient knowledge distillation framework for LLMs, supporting off-policy, on-policy (OPD), cross-tokenizer, multimodal, and on-policy self-distillation.

Python 199 15 Updated Jun 5, 2026
Python 5 Updated Sep 11, 2025

Code for EMNLP2023 paper "A Quality-based Syntactic Template Retriever for Syntactically-controlled Paraphrase Generation".

Python 4 Updated Mar 20, 2024

Code for EMNLP-2025 (Findings) paper “CM-Align: Consistency-based Multilingual Alignment for Large Language Models”.

Python 3 Updated Sep 11, 2025
Shell 6 3 Updated Sep 10, 2025

Code for "Think Natively: Unlocking Multilingual Reasoning with Consistency-Enhanced Reinforcement Learning".

Python 27 Updated Nov 11, 2025

Efficient Triton Kernels for LLM Training

Python 6,441 541 Updated Jun 16, 2026

slime is an LLM post-training framework for RL Scaling.

Python 6,153 897 Updated Jun 16, 2026

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-V4, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Gemma4, Llava, …

Python 14,530 1,481 Updated Jun 16, 2026

Ongoing research training transformer models at scale

Python 16,724 4,088 Updated Jun 16, 2026

Code for ACL 2025 Paper "AlignDistil: Token-Level Language Model Alignment as Adaptive Policy Distillation"

Python 3 Updated Aug 26, 2025

verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework

Python 22,004 4,085 Updated Jun 16, 2026

Retrieval and Retrieval-augmented LLMs

Python 11,832 889 Updated Apr 22, 2026

The official implementation of the paper "A Dual-Space Framework for General Knowledge Distillation of Large Language Models".

Python 15 1 Updated Jan 4, 2026

Arena-Hard-Auto: An automatic LLM benchmark.

Python 1,036 153 Updated Jun 21, 2025

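[Item-by-item processing complete] A QA dataset of selected questions from Ruozhiba (弱智吧), with every entry manually reviewed and revised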

257 11 Updated Feb 21, 2026

An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Async RL)

Python 9,648 969 Updated Jun 9, 2026

Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally.

Python 66,655 5,978 Updated Jun 16, 2026

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Python 7,093 790 Updated Jun 15, 2026

A framework for few-shot evaluation of language models.

Python 12,979 3,345 Updated Jun 2, 2026
Python 33 7 Updated Mar 13, 2024

Repo for the EMNLP'24 Paper "Dual-Space Knowledge Distillation for Large Language Models". A general white-box KD framework for both same-tokenizer and cross-tokenizer LLM distillation.

Python 63 12 Updated Mar 21, 2026

This repository maintains a collection of important papers on knowledge distillation (awesome-knowledge-distillation).

85 17 Updated Mar 19, 2025

PyTorch native post-training library

Python 5,773 729 Updated Jun 16, 2026

Awesome LLM compression research papers and tools.

1,846 128 Updated Feb 23, 2026

This repository collects papers for "A Survey on Knowledge Distillation of Large Language Models". We break down KD into Knowledge Elicitation and Distillation Algorithms, and explore the Skill & V…

1,291 72 Updated Mar 9, 2025

Official PyTorch implementation of DistiLLM: Towards Streamlined Distillation for Large Language Models (ICML 2024)

Python 267 29 Updated Mar 13, 2025

Fay is an agent framework that helps digital humans (2.5D, 3D, mobile, PC, web) or large language models (OpenAI-compatible, DeepSeek) connect to business systems.

Python 12,873 2,288 Updated May 29, 2026