Polly-LYP


The code for the paper "Learning from the Self-future: On-policy Self-distillation for dLLMs"

Python 6 1 Updated Jun 18, 2026

OmniAgent (ICML 2026): the first native omni-modal agent for active video perception — a 7B agent that beats Qwen2.5-VL-72B with 73% fewer frames.

Python 30 2 Updated Jun 18, 2026

Explain in Your Own Words: Improving Reasoning via Token-Selective Dual Knowledge Distillation. Accepted at the Fourteenth International Conference on Learning Representations (ICLR 2026).

Python 10 3 Updated May 9, 2026

SCOPE: Signal-Calibrated On-Policy Distillation Enhancement with Dual-Path Adaptive Weighting

Python 25 1 Updated Jun 22, 2026

On-Policy Distillation built on top of Verl

Python 86 6 Updated May 25, 2026

Official Implementation of Trajectory-Refined Distillation

Python 24 Updated Jun 9, 2026

Elevate your AI research writing, no more tedious polishing ✨

29,258 2,253 Updated May 18, 2026

Bash is all you need - A nano Claude Code-like "agent harness", built from 0 to 1

Python 67,950 11,052 Updated Jun 22, 2026

Academic Research Skills for Claude Code: research → write → review → revise → finalize

Python 33,757 2,776 Updated Jun 21, 2026

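An AI-native slides generator based on nano banana pro🍌, moving toward "Vibe PPT". Supports uploading any template image, uploading any source material with intelligent parsing, generating a PPT automatically from a single sentence, an outline, or page descriptions, editing a specified region by voice, and one-click export to an editable PPT.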

Python 15,026 1,751 Updated Jun 23, 2026

MiMo Code: Where Models and Agents Co-Evolve

TypeScript 10,382 971 Updated Jun 23, 2026

Awesome List for On-Policy Distillation

670 12 Updated Jun 19, 2026

Source code of paper "RLCSD: Reinforcement Learning with Contrastive On-Policy Self-Distillation"

Python 42 1 Updated Jun 16, 2026

Official implementation of "Constitutional On-Policy Safe Distillation"

Python 4 Updated Jun 16, 2026

Official code for "Self-Distilled Agentic Reinforcement Learning"

Python 272 19 Updated May 27, 2026

PyTorch-based open-source code for paper "SOD: Step-wise On-policy Distillation for Small Language Model Agents"

Python 139 8 Updated May 22, 2026

TCOD: Exploring Temporal Curriculum in On-Policy Distillation for Multi-turn Autonomous Agents

Python 69 3 Updated Jun 11, 2026

The official implementation of our preprint paper "When Are Teacher Tokens Reliable? Position-Weighted On-Policy Self-Distillation for Reasoning"

Python 9 1 Updated May 23, 2026

A curated collection of papers, technical reports, frameworks, and tools for on-policy distillation (OPD) of large language models

385 8 Updated Jun 23, 2026

GPT Image 2 prompt gallery, image prompt library, agentic skill, and CLI for OpenAI image generation/editing

Python 3,196 284 Updated May 23, 2026

UniRL is a Framework for Unified Multimodal Model Reinforcement Learning

Python 694 43 Updated Jun 22, 2026

Code for Self-Distilled RLVR (RLSD)

Python 46 1 Updated May 19, 2026

Vision-OPD is a regional-to-global on-policy self-distillation framework that transfers a model's own privileged crop-conditioned perception to its full-image policy, enabling fine-grained visual u…

Python 138 4 Updated Jun 14, 2026

LongTraceRL: Learning Long-Context Reasoning from Search Agent Trajectories with Rubric Rewards

Python 38 Updated Jun 1, 2026

GenClaw: Code-Driven Agentic Image Generation

227 2 Updated Jun 6, 2026

SkillOpt is a text-space optimizer that trains reusable natural-language skills for frozen LLM agents through trajectory-driven edits, validation-gated updates, and deployable best_skill.md artifacts.

Python 8,843 851 Updated Jun 20, 2026

Codebase for PrismMirror: Real-Time Human Frontal View Synthesis from a Single Image

11 Updated Mar 16, 2026

[ICML 2026] Official codebase for "Flash-VAED: Plug-and-Play VAE Decoders for Efficient Video Generation"

21 Updated May 9, 2026

Self-Evolving Image Generation Agents via Tool-Orchestrated Visual Experience Distillation

Python 67 Updated May 22, 2026