Zian Su ziansu

😼

Seeking truth

CS Ph.D. student | LLM & Language Agents & Knowledge Editing & Experiential Learning

27 followers · 20 following

Purdue University
West Lafayette, Indiana
ziansu.github.io

Highlights

Pro

Lists (14)

Agents

Frameworks & Applications

32 repositories

Benchmark

15 repositories

Contrastive Learning

Diffusion Models

20 repositories

Gyms

Agent environments

Inference

Interpretability

20 repositories

LLM

14 repositories

LRM

Large reasoning models / system-2 models

13 repositories

Meta Learning

Multimodal

On-Policy Distillation

Policy Optimization

Visualization

Stars

THUDM / slime

slime is an LLM post-training framework for RL Scaling.

Python · 6,209 stars · 905 forks · Updated Jun 18, 2026

NVIDIA-NeMo / RL

Scalable toolkit for efficient model reinforcement

Python · 1,743 stars · 428 forks · Updated Jun 18, 2026

NVIDIA / Megatron-LM

Ongoing research training transformer models at scale

Python · 16,748 stars · 4,101 forks · Updated Jun 18, 2026

ZJU-REAL / SDAR

Official code for "Self-Distilled Agentic Reinforcement Learning"

Python · 227 stars · 16 forks · Updated May 27, 2026

YoungZ365 / SOD

PyTorch-based open-source code for paper "SOD: Step-wise On-policy Distillation for Small Language Model Agents"

Python · 138 stars · 8 forks · Updated May 22, 2026

lasgroup / SDPO

Reinforcement Learning via Self-Distillation (SDPO)

Python · 957 stars · 107 forks · Updated Feb 18, 2026

sjelassi / ebft_openrlhf

Code for "Matching Features, Not Tokens: Energy-Based Fine-Tuning of Language Models".

Python · 23 stars · Updated Mar 16, 2026

LeapLabTHU / JustGRPO

[ICML 2026 Oral] Minimalist RL for Diffusion LLMs. 89.1% on GSM8K.

Python · 145 stars · 5 forks · Updated Jun 9, 2026

ML-GSAI / ESPO

Official PyTorch implementation for "Principled RL for Diffusion LLMs Emerges from a Sequence-Level Perspective"

Python · 38 stars · 2 forks · Updated Jan 25, 2026

inclusionAI / dFactory

Easy and Efficient dLLM Fine-Tuning

Python · 259 stars · 15 forks · Updated Mar 2, 2026

openai / guided-diffusion

Python · 7,392 stars · 914 forks · Updated Jul 2, 2024

patrickpynadath1 / candi-diffusion

CANDI: Continuous and Discrete Diffusion

Python · 27 stars · 1 fork · Updated Oct 27, 2025

facebookresearch / SPG

Code for the paper "SPG: Sandwiched Policy Gradient for Masked Diffusion Language Models"

Python · 60 stars · 6 forks · Updated Oct 29, 2025

ManimCommunity / manim

A community-maintained Python framework for creating mathematical animations.

Python · 39,059 stars · 2,917 forks · Updated Jun 17, 2026

ServiceNow / hdlm

Python · 12 stars · 1 fork · Updated Oct 2, 2025

cychomatica / FreeDave

Free Draft-and-Verification: Toward Lossless Parallel Decoding for Diffusion Large Language Models

Python · 21 stars · Updated May 19, 2026

dllm-reasoning / d1

Official Implementation for the paper "d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning"

Python · 446 stars · 52 forks · Updated Jan 26, 2026

Crys-Chen / DPad

Official implementation of "DPad: Efficient Diffusion Language Models with Suffix Dropout"

Python · 62 stars · 5 forks · Updated Feb 13, 2026

JetAstra / SDAR

SDAR (Synergy of Diffusion and AutoRegression), a large diffusion language model (1.7B, 4B, 8B, 30B)

Python · 359 stars · 22 forks · Updated Jun 2, 2026

NVlabs / DiffusionNFT

[ICLR 2026 Oral] DiffusionNFT: Online Diffusion Reinforcement with Forward Process

Python · 914 stars · 37 forks · Updated Feb 10, 2026

Gar-b-age / CookLikeHOC

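🥢 Cook like Lao Xiang Ji 🐔. Dishes newly introduced in the "Lao Xiang Ji Dish Traceability Report 2.0", published in 2026, have been added. The main part was completed in 2024; this is not an official Lao Xiang Ji repository. The text comes from the "Lao Xiang Ji Dish Traceability Report" and has been summarized, edited, and organized. CookLikeHOC.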

Dockerfile · 23,605 stars · 2,344 forks · Updated May 8, 2026

edb-rs / edb

EDB: The Ethereum Project Debugger

Rust · 364 stars · 41 forks · Updated May 10, 2026

Alibaba-NLP / DeepResearch

Tongyi Deep Research, the Leading Open-source Deep Research Agent

Python · 19,453 stars · 1,490 forks · Updated Feb 27, 2026

Multiverse4FM / Multiverse

Forked from Infini-AI-Lab/Multiverse

Python · 89 stars · 2 forks · Updated Jun 16, 2025

sgl-project / sglang

SGLang is a high-performance serving framework for large language models and multimodal models.

Python · 29,165 stars · 6,604 forks · Updated Jun 18, 2026

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python · 83,253 stars · 18,195 forks · Updated Jun 18, 2026

ryantzr1 / OpenAlita

Open Source Implementation of Alita: Generalist Agent Enabling Scalable Agentic Reasoning with Minimal Predefinition and Maximal Self-Evolution

C++ · 100 stars · 15 forks · Updated Jul 18, 2025

VILA-Lab / Awesome-DLMs

The official GitHub repo for the survey paper "A Survey on Diffusion Language Models".

1,106 stars · 52 forks · Updated May 29, 2026

WooooDyy / AgentGym-RL

Code and implementations for the paper "AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning" by Zhiheng Xi et al.

Python · 778 stars · 74 forks · Updated Feb 15, 2026

Gen-Verse / dLLM-RL

[ICLR 2026] Official code for TraceRL: Revolutionizing post-training for Diffusion LLMs, powering the SOTA TraDo series.

Python · 510 stars · 43 forks · Updated Jan 28, 2026