Highlights

  • Pro

Organizations

@tianyi-lab

A curated collection of papers, technical reports, frameworks, and tools for on-policy distillation (OPD) of large language models

392 8 Updated Jun 23, 2026

Awesome List for On-Policy Distillation

670 12 Updated Jun 19, 2026

A curated collection of papers and resources on On-Policy Distillation for Large Language Models.

Python 352 6 Updated Jun 21, 2026

A curated list of resources (surveys, papers, benchmarks, and open-source projects) on Rubrics

90 3 Updated Jun 15, 2026

SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks?

Python 452 83 Updated May 18, 2026

SenseNova-U series: Native Unified Paradigm with NEO-unify from the First Principles

Python 3,336 292 Updated Jun 15, 2026

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Python 696 44 Updated May 30, 2026

Paper list of agents for science

269 23 Updated Mar 12, 2026

[ICLR 2026] Official repo for "Spotlight on Token Perception for Multimodal Reinforcement Learning"

Python 68 6 Updated Apr 3, 2026

[Findings of ACL 2026] ArrowGEV: Grounding Events in Video via Learning the Arrow of Time

Python 4 Updated Apr 19, 2026

MiroEval: A benchmark and evaluation framework for deep research agents — 100 tasks (70 text, 30 multimodal) assessed across synthesis quality, factuality, and research process. 13 systems evaluated.

Python 43 7 Updated Apr 6, 2026

Awesome Unified Multimodal Models

1,286 40 Updated Mar 24, 2026

A unified multimodal model toolkit

Python 131 9 Updated May 18, 2026

[ICML 2026] XSkill: Continual Learning from Experience and Skills in Multimodal Agents

Python 225 27 Updated May 13, 2026

Official repository for the paper "Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation"

Python 240 13 Updated May 28, 2026

Cheers: Decoupling Patch Details from Semantic Representations Enables Unified Multimodal Comprehension and Generation

Python 255 21 Updated Apr 13, 2026

SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning

Python 850 64 Updated May 17, 2026

Automatic Environment Generation with Evolving Coding Agent for Embodied Agent Learning

Python 128 17 Updated Jun 23, 2026

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 380,051 79,583 Updated Jun 23, 2026

[ICLR 2025] "GraphRouter: A Graph-based Router for LLM Selections", Tao Feng, Yanzhen Shen, Jiaxuan You

Python 73 7 Updated Dec 30, 2025

Qwen3.6 is the large language model series developed by Qwen team, Alibaba Group.

3,608 241 Updated Jun 3, 2026

ICLR 2026 (Oral) | EmotionThinker: Prosody-Aware Reinforcement Learning for Explainable Speech Emotion Reasoning

Python 54 4 Updated Feb 12, 2026

3 Updated Jan 30, 2026

Python 4,534 491 Updated Apr 22, 2026

OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis

Python 787 79 Updated Jun 10, 2026

DeliveryBench: Can Agents Earn Profit in Real World?

Python 18 1 Updated Feb 11, 2026

Python 28 1 Updated Jan 31, 2026

Python 48 4 Updated Jan 30, 2026

[ICML 2026] Multimodal deep-research MLLM and benchmark. The first long-horizon multimodal deep-research MLLM, extending the number of reasoning turns to dozens and the number of search-engine inte…

Python 648 56 Updated Jun 8, 2026

Reinforcement Learning via Self-Distillation (SDPO)

Python 967 109 Updated Feb 18, 2026