Conditional Diffusion Distillation
October 2, 2023
Authors: Kangfu Mei, Mauricio Delbracio, Hossein Talebi, Zhengzhong Tu, Vishal M. Patel, Peyman Milanfar
cs.AI
Abstract
Generative diffusion models provide strong priors for text-to-image
generation and thereby serve as a foundation for conditional generation tasks
such as image editing, restoration, and super-resolution. However, one major
limitation of diffusion models is their slow sampling time. To address this
challenge, we present a novel conditional distillation method designed to
supplement the diffusion priors with the help of image conditions, allowing for
conditional sampling with very few steps. We distill the unconditional
pre-training directly in a single stage through joint learning, greatly
simplifying the previous two-stage procedure, which performs distillation and
conditional fine-tuning separately. Furthermore, our method enables a new
parameter-efficient distillation mechanism that distills each task with only a
small number of additional parameters combined with the shared frozen
unconditional backbone. Experiments across multiple tasks including
super-resolution, image editing, and depth-to-image generation demonstrate that
our method outperforms existing distillation techniques for the same sampling
time. Notably, our method is the first distillation strategy that can match the
performance of the much slower fine-tuned conditional diffusion models.
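The parameter-efficient mechanism described above can be illustrated with a minimal sketch: a frozen unconditional backbone is shared across tasks, and only a small per-task adapter that injects the image condition is updated against a distillation target. All names, shapes, and the low-rank adapter form below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 64  # feature width (assumed)

# Frozen unconditional backbone: a fixed random projection standing in
# for the pretrained diffusion denoiser. Never updated during distillation.
W_backbone = rng.standard_normal((D, D)) / np.sqrt(D)

def backbone(x):
    return np.tanh(x @ W_backbone)

# Small trainable adapter (hypothetical low-rank form) fusing the image
# condition c into the frozen backbone's features.
r = 4  # adapter rank (assumed)
A = rng.standard_normal((D, r)) * 0.01  # trainable
B = rng.standard_normal((r, D)) * 0.01  # trainable

def student(x, c):
    return backbone(x) + (c @ A) @ B

# Stand-in teacher target: plays the role of the slow multi-step
# conditional sampler's output that the student is distilled toward.
W_teacher = rng.standard_normal((D, D)) / np.sqrt(D)

def teacher(x, c):
    return np.tanh(x @ W_backbone) + 0.1 * (c @ W_teacher)

# One gradient step on the distillation loss ||student - teacher||^2,
# updating only the adapter (A, B); the backbone stays frozen.
x = rng.standard_normal((8, D))
c = rng.standard_normal((8, D))

err = student(x, c) - teacher(x, c)
loss_before = np.mean(err ** 2)

grad_B = (c @ A).T @ err / len(x)       # d loss / d B (up to a constant)
grad_A = c.T @ (err @ B.T) / len(x)     # d loss / d A (up to a constant)
lr = 0.1
A -= lr * grad_A
B -= lr * grad_B

loss_after = np.mean((student(x, c) - teacher(x, c)) ** 2)

n_trainable = A.size + B.size   # 512 adapter parameters
n_frozen = W_backbone.size      # 4096 frozen backbone parameters
print(n_trainable, n_frozen, loss_before > loss_after)
```

The point of the sketch is the parameter budget: the per-task adapter here holds 512 weights against 4096 frozen backbone weights, mirroring the abstract's claim that each task is distilled with only a small number of additional parameters on top of a shared frozen unconditional backbone.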