- HKUST, Guangzhou
- https://owen718.github.io
- https://scholar.google.com/citations?user=1sGXZ-wAAAAJ&hl=en
Stars
Claude Code-style recurring loop scheduling for OpenAI Codex.
SenseNova-U series: Native Unified Paradigm with NEO-Unify from the First Principles
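Distills a doctoral advisor's ten years of research experience into directly callable AI skills. From idea conception to paper submission, your AI research co-advisor.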
Hermes WebUI: The best way to use Hermes Agent from the web or from your phone!
[CVPR2026] PosterReward: Unlocking Accurate Evaluation for High-Quality Graphic Design Generation
Try X-Dub to sync any character in a video with any audio you like | Official repository for "From Inpainting to Editing: Unlocking Robust Mask-Free Visual Dubbing via Generative Bootstrapping"
[Ultra Powerful Few-Step Diffusion RL] TDM-R1: Reinforcing Few-Step Diffusion Models with Non-Differentiable Reward
Build your own Claude Code-like agent from scratch, step by step, with AI assistance.
[CVPR2026] PosterOmni: One model for poster creation—unifying local edits and global design for generalized multi-task image/poster-to-poster generation.
Streaming Flux editor: live camera → editing of every frame at interactive FPS, based on FLUX.2-Klein-4B. Runs on a single H100 at 15+ FPS.
Official Code of "Distribution Matching Distillation Meets Reinforcement Learning"
Data and sample evaluation code for Multimodal Rewardbench 2
[ICLR 26 Oral] Stable Video Infinity: Infinite-Length Video Generation with Error Recycling
[Tutorial] Few-Step Distillation for Text-to-Image Generation: A Practical Guide
Data-Model Co-Design for High-quality Native 4K Text-to-Image Generation across Diverse Aspect Ratios. Try it in ComfyUI.
iMontage: Unified, Versatile, Highly Dynamic Many-to-many Image Generation
UltraFlux: Data-Model Co-Design for High-quality Native 4K Text-to-Image Generation across Diverse Aspect Ratios
[NeurIPS 2025 Oral] Infinity⭐️: Unified Spacetime AutoRegressive Modeling for Visual Generation
LucidFlux: Caption-Free Photo-Realistic Image Restoration via a Large-Scale Diffusion Transformer, ICLR 2026
[ICLR 2026] pi-Flow: Policy-Based Few-Step Generation via Imitation Distillation
A simple, unified training engine for multimodal models. Lean, flexible, and built for hacking at scale.
Lynx: Towards High-Fidelity Personalized Video Generation
[ICLR 2026 Oral] DiffusionNFT: Online Diffusion Reinforcement with Forward Process
MUG-V 10B: High-efficiency Training Pipeline for Large Video Generation Models
MoCha: End-to-End Video Character Replacement without Structural Guidance
Identity-GRPO: Optimizing Multi-Human Identity-preserving Video Generation via Reinforcement Learning
LucidFlux: Caption-Free Universal Image Restoration with a Large-Scale Diffusion Transformer. You can use it in ComfyUI.
VideoNSA: Native Sparse Attention Scales Video Understanding