Yunfeng Wu

I'm a senior at Xi'an Jiaotong-Liverpool University (XJTLU) in Suzhou, China, majoring in Information and Computing Science. I'll be pursuing an M.Sc. in Computer Science at The University of Hong Kong (HKU). I'm interested in computer vision and generative AI, and I'm currently conducting research on video generation and world models.

Email / Scholar / GitHub

Publications

ViBe: Ultra-High-Resolution Video Synthesis Born from Pure Images
Y. Wu, H. Cheng, Z. He, S. Liu
paper / code

Naively fine-tuning a single LoRA on high-resolution images introduces noise and artifacts. We introduce Relay-LoRA, a two-stage fine-tuning method that reduces noise and enhances visual detail.

FreeSwim: Revisiting Sliding-Window Attention Mechanisms for Training-Free Ultra-High-Resolution Video Generation
Y. Wu, J. Song, Z. Tan, Z. He, S. Liu
paper / code

We identify the root cause of degradation at high resolutions and propose an efficient Flex-Attention-based interpolation window masking mechanism for seamless 4K video generation.

MitPose: Multi-Granularity Guided Vision Transformer for Human Pose Estimation
Y. Wu, Q. Gao, Y. Liu, J. Sun, Z. Li, Y. Jin, Y. Yue, X. Zhu
INDIN, 2025
paper / code

We introduce a complementary mechanism that combines over-parameterized convolution with global attention for multi-granularity feature representation, achieving state-of-the-art (SOTA) performance on the COCO and MPII benchmarks.

Internship Experience

Alibaba Group, Research Intern | Supervised by Xiangxiang Chu | Research Direction: World Model
Shanghai Jiao Tong University, Research Intern | Supervised by Songhua Liu | Research Direction: Video Generation
Xi'an Jiaotong-Liverpool University, Research Assistant | Supervised by Yong Yue | Research Direction: Human Pose Estimation

Feel free to steal this website's source code. Do not scrape the HTML from this page itself, as it includes analytics tags that you do not want on your own website; use the GitHub code instead. Also, consider using Leonid Keselman's Jekyll fork of this page.