- BS@UoL; RA@SJTU; Intern@Alibaba; Incoming MSc@HKU
- Beijing, China
- UTC +08:00
- https://willwu111.github.io
Stars
JoyAI-Echo: Pushing the Frontier of Long Audio-Visual Generation
A Minimal and Elegant Framework & Tutorial for Real-Time Interactive World Models
Xetrieval: Mechanistically Explaining Dense Retrieval
A Minimalist, Batteries-included Repository for Advancing World Model Science.
Code for Fast Training of Diffusion Models with Masked Transformers
aaaaxiaoxu / biztrack
Forked from sptin2002/biztrack
BizTrack is a simple web application for small businesses, providing an intuitive order tracking system to manage finances, calculate total order income and expenses, and track profits or losses se…
[ICLR 2026] Official implementation of JavisDiT and JavisDiT++ series.
ViBe: Ultra-High-Resolution Video Synthesis Born from Pure Images
[ICCV 2025] Official implementation of the paper: REPA-E: Unlocking VAE for End-to-End Tuning of Latent Diffusion Transformers
hammershock / moffee
Forked from wbopan/moffee
moffee: Make Markdown Ready to Present
A Lightweight, Configuration-Driven, Flexible Fine-Tuning Framework for 🤗 Diffusers
DreamX-World: A General-Purpose Interactive World Model
A platform for reproducible world model research and evaluation
TempoFit: Plug-and-Play Layer-Wise Temporal KV Memory for Long-Horizon Vision-Language-Action Manipulation
A comprehensive benchmark specifically designed to evaluate the interactive response capabilities of world models in 4D settings.
4-step distilled version of Wan2.2-TI2V-5B
[AAAI-2026] FlashVideo: Flowing Fidelity to Detail for Efficient High-Resolution Video Generation
Let's make video diffusion practical!
MOVA: Towards Scalable and Synchronized Video–Audio Generation
[ICML 2026] Official repository for the paper "Light Forcing: Accelerating Autoregressive Video Diffusion via Sparse Attention"
[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.
[ICLR 26 Oral] Stable Video Infinity: Infinite-Length Video Generation with Error Recycling
[ICML 2026] Official codebase for "Causal Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactive Video Generation" & Causal Forcing++
Elevate your AI research writing, no more tedious polishing ✨
[ICML 2026] PyTorch implementation of Self-Refining Video Sampling
WeDLM: The fastest diffusion language model with standard causal attention and native KV cache compatibility, delivering real speedups over vLLM-optimized baselines.
Official Python inference and LoRA trainer package for the LTX-2 audio–video generative model.