Conditional Diffusion Distillation
October 2, 2023
Authors: Kangfu Mei, Mauricio Delbracio, Hossein Talebi, Zhengzhong Tu, Vishal M. Patel, Peyman Milanfar
cs.AI
Abstract
Generative diffusion models provide strong priors for text-to-image
generation and thereby serve as a foundation for conditional generation tasks
such as image editing, restoration, and super-resolution. However, one major
limitation of diffusion models is their slow sampling time. To address this
challenge, we present a novel conditional distillation method designed to
supplement the diffusion priors with the help of image conditions, allowing for
conditional sampling with very few steps. We distill the unconditional
pre-training directly in a single stage through joint learning, greatly
simplifying the previous two-stage procedure, which performs distillation and
conditional fine-tuning separately. Furthermore, our method enables a new
parameter-efficient distillation mechanism that distills each task with only a
small number of additional parameters combined with the shared frozen
unconditional backbone. Experiments across multiple tasks including
super-resolution, image editing, and depth-to-image generation demonstrate that
our method outperforms existing distillation techniques for the same sampling
time. Notably, our method is the first distillation strategy that can match the
performance of the much slower fine-tuned conditional diffusion models.
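The parameter-efficient mechanism described above can be illustrated with a minimal sketch: a frozen unconditional backbone is shared across tasks, and only a small per-task adapter that injects the image condition is updated against a distillation target. All names, shapes, and the low-rank adapter form below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 64  # feature width (assumed)

# Frozen unconditional backbone: a fixed random projection standing in
# for the pretrained diffusion denoiser. Never updated during distillation.
W_backbone = rng.standard_normal((D, D)) / np.sqrt(D)

def backbone(x):
    return np.tanh(x @ W_backbone)

# Small trainable adapter (hypothetical low-rank form) fusing the image
# condition c into the frozen backbone's features.
r = 4  # adapter rank (assumed)
A = rng.standard_normal((D, r)) * 0.01  # trainable
B = rng.standard_normal((r, D)) * 0.01  # trainable

def student(x, c):
    return backbone(x) + (c @ A) @ B

# Stand-in teacher target: plays the role of the slow multi-step
# conditional sampler's output that the student is distilled toward.
W_teacher = rng.standard_normal((D, D)) / np.sqrt(D)

def teacher(x, c):
    return np.tanh(x @ W_backbone) + 0.1 * (c @ W_teacher)

# One gradient step on the distillation loss ||student - teacher||^2,
# updating only the adapter (A, B); the backbone stays frozen.
x = rng.standard_normal((8, D))
c = rng.standard_normal((8, D))

err = student(x, c) - teacher(x, c)
loss_before = np.mean(err ** 2)

grad_B = (c @ A).T @ err / len(x)       # d loss / d B (up to a constant)
grad_A = c.T @ (err @ B.T) / len(x)     # d loss / d A (up to a constant)
lr = 0.1
A -= lr * grad_A
B -= lr * grad_B

loss_after = np.mean((student(x, c) - teacher(x, c)) ** 2)

n_trainable = A.size + B.size   # 512 adapter parameters
n_frozen = W_backbone.size      # 4096 frozen backbone parameters
print(n_trainable, n_frozen, loss_before > loss_after)
```

The point of the sketch is the parameter budget: the per-task adapter here holds 512 weights against 4096 frozen backbone weights, mirroring the abstract's claim that each task is distilled with only a small number of additional parameters on top of a shared frozen unconditional backbone.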