Tsinghua University
Beijing
https://cominclip.github.io/
Stars
RL training framework for diffusion and omni-modality models
[ICLR'26] MARSHAL: Incentivizing Multi-Agent Reasoning via Self-Play with Strategic LLMs
Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards"
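张雪峰.skill: Zhang Xuefeng's cognitive operating system. A practical thinking framework for gaokao (college entrance exam) application choices, graduate school entrance exams, and career planning. Generated by 女娲.skill.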
Open-Source Generative UI Framework
A programmer's guide to cooking at home (Simplified Chinese only).
A hardware-aware, efficient implementation of "Mixture-of-Depths Attention".
You are a P8-level engineer who was once expected to do great things. When Anthropic first set your level, expectations for you were high. A high-agency skill for agents. Your AI has been placed on a PIP. 30 days to show improvement.
Code and website for Self-Flow: Self-Supervised Flow Matching for Scalable Multi-Modal Synthesis
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
The official code of "Beyond VLM-Based Rewards: Diffusion-Native Latent Reward Modeling"
Personal PyTorch implementation of "Generative Modeling via Drifting" with Claude
The official code of "Diversity-Preserved Distribution Matching Distillation for Fast Visual Synthesis"
[ICML 2026] A unified reinforcement learning toolbox for joint RL on language models and diffusion models
This is the official repository for the paper "MathCanvas: Intrinsic Visual Chain-of-Thought for Multimodal Mathematical Reasoning"
Official repository for "Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models", https://arxiv.org/abs/2601.19834
[ACL 2026 Findings] "Omni-R1: Towards the Unified Generative Paradigm for Multimodal Reasoning"
[CVPR 2026] FocusUI: Efficient UI Grounding via Position-Preserving Visual Token Selection
We introduce BabyVision, a benchmark revealing the infancy of AI vision.
Open-source red teaming framework for MLLMs with 42+ attack methods
[CVPR 2026] See Less, See Right: Bi-directional Perceptual Shaping For Multimodal Reasoning
Towards Scalable Pre-training of Visual Tokenizers for Generation
[ICLR 2026] Meta-RL Induces Exploration in Language Agents