CVPR 2026
- A simple but effective insight: in one-step diffusion distillation, weight direction matters much more than weight norm.
- We propose LoRaD (Low-rank Rotation of weight Direction), a parameter-efficient adapter for modeling structured directional changes.
- We build WaDi, a weight direction-aware one-step distillation framework based on VSD.
- WaDi achieves state-of-the-art FID on COCO 2014 and COCO 2017.
- WaDi uses only ~10% trainable parameters of the original U-Net / DiT.
- The distilled one-step model generalizes well to controllable generation, relation inversion, and high-resolution synthesis.
Motivational analysis of WaDi. (a) Differences in weight norm and direction between the one-step student and the teacher model. (b) SVD analysis of the residual matrix for DMD2. (c) Replacing the one-step model's norm with that of the multi-step model has little effect, while replacing the direction severely degrades generation quality. (d) Qualitative examples corresponding to (c). (e) Illustration of LoRaD.
Diffusion models such as Stable Diffusion achieve impressive image generation quality, but their multi-step inference is still expensive for practical deployment. Recent works aim to accelerate inference by distilling multi-step diffusion models into one-step generators.
To better understand the distillation mechanism, we analyze the weight changes between one-step students and their multi-step teacher counterparts in both U-Net and DiT models. Our analysis shows that directional changes in weights are significantly larger and more important than norm changes during one-step distillation.
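As a rough illustration of this kind of analysis, one could compare a student layer against its teacher counterpart by separating norm change from directional (angular) change. This is a minimal sketch under our own assumptions; `weight_change_stats` and the specific metrics below are illustrative, not the paper's exact measurement protocol:

```python
import numpy as np

def weight_change_stats(W_teacher, W_student):
    """Compare a student weight matrix to its teacher counterpart.

    Returns (relative norm change, angular change in degrees).
    Illustrative metrics only; the paper may measure these differently.
    """
    nt = np.linalg.norm(W_teacher)
    ns = np.linalg.norm(W_student)
    # Relative change in Frobenius norm.
    norm_change = abs(ns - nt) / nt
    # Angle between the two matrices viewed as flattened vectors.
    cos = np.sum(W_teacher * W_student) / (nt * ns)
    direction_change = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
    return norm_change, direction_change

# A pure rescaling changes the norm but not the direction:
W = np.random.default_rng(0).standard_normal((4, 4))
print(weight_change_stats(W, 2.0 * W))  # norm change 1.0, angle ~0 degrees
```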
Motivated by this finding, we propose LoRaD (Low-rank Rotation of weight Direction), a lightweight adapter that models structured directional changes using learnable low-rank rotation matrices. We further integrate LoRaD into Variational Score Distillation (VSD) and build WaDi, a novel one-step distillation framework.
WaDi achieves state-of-the-art FID on COCO 2014 and COCO 2017 while using only ~10% of the trainable parameters of the original U-Net / DiT. In addition, the distilled one-step model remains versatile and scalable, supporting downstream applications such as controllable generation, relation inversion, and high-resolution synthesis.
Left: architecture of the LoRaD module, which rotates pretrained weight directions using learnable low-rank rotation angles. Right: overview of the WaDi framework.
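To make the idea concrete, here is a minimal sketch of a norm-preserving low-rank rotation applied to a weight matrix. We assume a Cayley-transform parameterization of the rotation from low-rank factors `U`, `V`; the paper's actual LoRaD parameterization may differ, and `lorad_rotate` is our own illustrative name:

```python
import numpy as np

def lorad_rotate(W, U, V):
    """Rotate the directions of pretrained weights W while keeping
    their per-row norms fixed.

    W: (out, in) pretrained weight matrix.
    U, V: (in, r) learnable low-rank factors (r << in).
    Parameterization is an assumption: a skew-symmetric low-rank
    generator mapped to an orthogonal matrix via the Cayley transform.
    """
    # Decompose W into per-row norm and unit direction.
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    D = W / norms
    # Skew-symmetric low-rank generator A = U V^T - V U^T.
    A = U @ V.T - V @ U.T
    I = np.eye(W.shape[1])
    # Cayley transform: R = (I + A)^{-1} (I - A) is orthogonal.
    R = np.linalg.solve(I + A, I - A)
    # Rotate directions; pretrained norms are untouched.
    return norms * (D @ R)
```

Because `R` is orthogonal, the rotation changes only the direction of each weight row, matching the motivation that directional change dominates during one-step distillation.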
| Model | Architecture | Hugging Face | ModelScope |
|---|---|---|---|
| WaDi-SD2.1 | U-Net | | |
| WaDi-SD1.5 | U-Net | | |
| WaDi-PixArt | DiT | | |
```
git clone https://github.com/gudaochangsheng/WaDi.git
cd WaDi
conda create -n wadi python=3.8 -y
conda activate wadi
pip install -r requirements.txt
```

```
# Train WaDi on Stable Diffusion 1.5
bash train_dkd_sd1.5.sh

# Train WaDi on Stable Diffusion 2.1
bash train_dkd_sd2.1.sh

# Train WaDi on PixArt-alpha
bash train_dkd_pixart.sh
```

```
# Inference for Stable Diffusion models
python infer_sd_model.py

# Inference for PixArt-alpha
python infer_pixart.py
```

If you find WaDi useful, please consider giving this repository a star ⭐ and citing our paper.
```
@article{wang2026wadi,
  title={WaDi: Weight Direction-aware Distillation for One-step Image Synthesis},
  author={Wang, Lei and Cheng, Yang and Li, Senmao and Wu, Ge and Wang, Yaxing and Yang, Jian},
  journal={arXiv preprint arXiv:2603.08258},
  year={2026}
}

@inproceedings{li2025one,
  title={One-Way Ticket: Time-Independent Unified Encoder for Distilling Text-to-Image Diffusion Models},
  author={Li, Senmao and Wang, Lei and Wang, Kai and Liu, Tao and Xie, Jiehang and van de Weijer, Joost and Khan, Fahad Shahbaz and Yang, Shiqi and Wang, Yaxing and Yang, Jian},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2025}
}
```

This project is built upon Diffusers.
We also sincerely acknowledge the inspiring prior work:
TiUE and SwiftBrush.
If you have any questions, please feel free to contact:
scitop1998@gmail.com