Plug-and-Play Diffusion Distillation
Proceedings of the IEEE/CVF Conference on Computer Vision and …, 2024 • openaccess.thecvf.com
Abstract
Diffusion models have shown tremendous results in image generation. However, due to the iterative nature of the diffusion process and its reliance on classifier-free guidance, inference times are slow. In this paper, we propose a new distillation approach for guided diffusion models in which an external lightweight guide model is trained while the original text-to-image model remains frozen. We show that our method reduces the inference computation of classifier-free guided latent-space diffusion models by almost half and requires trainable parameters amounting to only 1% of the base model. Furthermore, once trained, our guide model can be applied to various fine-tuned, domain-specific versions of the base diffusion model without additional training: this "plug-and-play" functionality drastically reduces inference computation while maintaining the visual fidelity of generated images. Empirically, we show that our approach produces visually appealing results and achieves an FID score comparable to the teacher's with as few as 8 to 16 steps.
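To illustrate where the "almost half" saving could come from: standard classifier-free guidance evaluates the denoiser twice per sampling step (once unconditionally, once conditionally) and combines the two noise predictions as eps_uncond + w * (eps_cond - eps_uncond). The sketch below is a toy illustration only, not the paper's implementation. The names CountingDenoiser, toy_guide, and guided_eps are hypothetical, the "hint" mechanism is an assumed stand-in for however the guide model feeds the frozen base model, and the guide here is computed analytically rather than trained.

```python
from dataclasses import dataclass
from typing import List, Optional

Vector = List[float]


@dataclass
class CountingDenoiser:
    """Toy stand-in for a frozen base denoiser; counts forward passes."""
    calls: int = 0

    def __call__(self, x: Vector, cond: Optional[float],
                 hint: Optional[Vector] = None) -> Vector:
        self.calls += 1
        shift = 0.0 if cond is None else cond
        out = [0.5 * xi + shift for xi in x]
        if hint is not None:
            # Assumption: the guide's output is simply added to the prediction.
            out = [o + h for o, h in zip(out, hint)]
        return out


def cfg_eps(model: CountingDenoiser, x: Vector, cond: float, w: float) -> Vector:
    """Classifier-free guidance: two base-model passes per step."""
    eps_uncond = model(x, None)
    eps_cond = model(x, cond)
    return [u + w * (c - u) for u, c in zip(eps_uncond, eps_cond)]


def toy_guide(x: Vector, cond: float, w: float) -> Vector:
    """Hypothetical lightweight guide, conditioned on the guidance scale w.

    For this toy denoiser the exact correction is known in closed form;
    a real guide model would have to learn it by distillation.
    """
    return [(w - 1.0) * cond for _ in x]


def guided_eps(model: CountingDenoiser, x: Vector, cond: float, w: float) -> Vector:
    """One base-model pass plus a cheap guide call."""
    return model(x, cond, hint=toy_guide(x, cond, w))


if __name__ == "__main__":
    x, cond, w = [1.0, -2.0], 0.3, 7.5
    teacher, student = CountingDenoiser(), CountingDenoiser()
    print(cfg_eps(teacher, x, cond, w), "base passes:", teacher.calls)
    print(guided_eps(student, x, cond, w), "base passes:", student.calls)
```

In this toy setting both paths produce the same prediction, but the guided path calls the base denoiser once per step instead of twice. The saving is "almost" rather than exactly half because the guide itself adds a small cost.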