Polly-LYP

Yanping LI Polly-LYP

PhD at HKUST

13 followers · 192 following

Lists (26)

Stars

xingzhejun / d-opsd-code

The code for the paper "Learning from the Self-future: On-policy Self-distillation for dLLMs"

Python 6 1 Updated Jun 18, 2026

HarryHsing / OmniAgent

OmniAgent (ICML 2026): the first native omni-modal agent for active video perception — a 7B agent that beats Qwen2.5-VL-72B with 73% fewer frames.

Python 30 2 Updated Jun 18, 2026

kmswin1 / TSD-KD

Explain in Your Own Words: Improving Reasoning via Token-Selective Dual Knowledge Distillation, The Fourteenth International Conference on Learning Representations (ICLR) 2026, Accepted

Python 10 3 Updated May 9, 2026

machine981 / SCOPE

SCOPE: Signal-Calibrated On-Policy Distillation Enhancement with Dual-Path Adaptive Weighting

Python 25 1 Updated Jun 22, 2026

HJSang / OPSD_OnPolicyDistillation

On-Policy Distillation, built on top of Verl

Python 86 6 Updated May 25, 2026

louieworth / trd

Official Implementation of Trajectory-Refined Distillation

Python 24 Updated Jun 9, 2026

Leey21 / awesome-ai-research-writing

Elevate your AI research writing, no more tedious polishing ✨

29,258 2,253 Updated May 18, 2026

shareAI-lab / learn-claude-code

Bash is all you need - A nano Claude Code-like "agent harness", built from 0 to 1

Python 67,950 11,052 Updated Jun 22, 2026

Imbad0202 / academic-research-skills

Academic Research Skills for Claude Code: research → write → review → revise → finalize

Python 33,757 2,776 Updated Jun 21, 2026

Anionex / banana-slides

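An AI-native slides (PPT) generator based on nano banana pro🍌, moving toward "Vibe PPT". It supports uploading any template image, uploading arbitrary materials with intelligent parsing, automatic PPT generation from a single sentence, an outline, or page descriptions, verbal edits to specified regions, and one-click export to an editable PPT.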

Python 15,026 1,751 Updated Jun 23, 2026

XiaomiMiMo / MiMo-Code

MiMo Code: Where Models and Agents Co-Evolve

TypeScript 10,382 971 Updated Jun 23, 2026

thinkwee / AwesomeOPD

Awesome List for On-Policy Distillation

670 12 Updated Jun 19, 2026

THU-BPM / RLCSD

Source code of paper "RLCSD: Reinforcement Learning with Contrastive On-Policy Self-Distillation"

Python 42 1 Updated Jun 16, 2026

SII-FLEEECERmw / SafeC-OPSD

Official implementation of "Constitutional On-Policy Safe Distillation"

Python 4 Updated Jun 16, 2026

ZJU-REAL / SDAR

Official code for "Self-Distilled Agentic Reinforcement Learning"

Python 272 19 Updated May 27, 2026

YoungZ365 / SOD

PyTorch-based open-source code for paper "SOD: Step-wise On-policy Distillation for Small Language Model Agents"

Python 139 8 Updated May 22, 2026

kokolerk / TCOD

TCOD: Exploring Temporal Curriculum in On-Policy Distillation for Multi-turn Autonomous Agents

Python 69 3 Updated Jun 11, 2026

SaFo-Lab / PW-OPSD

The official implementation of our preprint paper "When Are Teacher Tokens Reliable? Position-Weighted On-Policy Self-Distillation for Reasoning"

Python 9 1 Updated May 23, 2026

HJSang / CRISP_Reasoning_Compression

Python 60 7 Updated Jun 12, 2026

chrisliu298 / awesome-on-policy-distillation

A curated collection of papers, technical reports, frameworks, and tools for on-policy distillation (OPD) of large language models

385 8 Updated Jun 23, 2026

wuyoscar / GPT-Image2-Skill

GPT Image 2 prompt gallery, image prompt library, agentic skill, and CLI for OpenAI image generation/editing

Python 3,196 284 Updated May 23, 2026

Tencent-Hunyuan / UniRL

UniRL is a Framework for Unified Multimodal Model Reinforcement Learning

Python 694 43 Updated Jun 22, 2026

iie-ycx / RLSD

Code of Self-Distilled RLVR - RLSD

Python 46 1 Updated May 19, 2026

VisionOPD / Vision-OPD

Vision-OPD is a regional-to-global on-policy self-distillation framework that transfers a model's own privileged crop-conditioned perception to its full-image policy, enabling fine-grained visual u…

Python 138 4 Updated Jun 14, 2026

THU-KEG / LongTraceRL

LongTraceRL: Learning Long-Context Reasoning from Search Agent Trajectories with Rubric Rewards

Python 38 Updated Jun 1, 2026

yejy53 / GenClaw

GenClaw: Code-Driven Agentic Image Generation

227 2 Updated Jun 6, 2026

SkillOpt is a text-space optimizer that trains reusable natural-language skills for frozen LLM agents through trajectory-driven edits, validation-gated updates, and deployable best_skill.md artifacts.

Python 8,843 851 Updated Jun 20, 2026

rslinfy / PrismMirror

Codebase for PrismMirror: Real-Time Human Frontal View Synthesis from a Single Image

11 Updated Mar 16, 2026

Aoko955 / Flash-VAED

[ICML 2026] Official codebase for "Flash-VAED: Plug-and-Play VAE Decoders for Efficient Video Generation"

21 Updated May 9, 2026

MeiGen-AI / GenEvolve

Self-Evolving Image Generation Agents via Tool-Orchestrated Visual Experience Distillation

Python 67 Updated May 22, 2026

Lists (26)

3D&LLM

agent

Awesome

Data selection

diffusion

diffusion LLM

efficiency

Foundation model

Framework

Generation

generation agent

LLM reasoning

LLM RL

Long context understanding

MLLM reasoning

MLLM safety

MLLM understanding

multimodal embedding

OCR

on policy distillation

other

safety

self evolving llm

streaming VLM

unified model

unified model reasoning
