
Conditional Diffusion Distillation

October 2, 2023
Authors: Kangfu Mei, Mauricio Delbracio, Hossein Talebi, Zhengzhong Tu, Vishal M. Patel, Peyman Milanfar
cs.AI

Abstract

Generative diffusion models provide strong priors for text-to-image generation and thereby serve as a foundation for conditional generation tasks such as image editing, restoration, and super-resolution. However, one major limitation of diffusion models is their slow sampling time. To address this challenge, we present a novel conditional distillation method designed to supplement the diffusion priors with the help of image conditions, allowing for conditional sampling with very few steps. We directly distill the unconditional pre-training in a single stage through joint-learning, largely simplifying the previous two-stage procedures that involve both distillation and conditional finetuning separately. Furthermore, our method enables a new parameter-efficient distillation mechanism that distills each task with only a small number of additional parameters combined with the shared frozen unconditional backbone. Experiments across multiple tasks including super-resolution, image editing, and depth-to-image generation demonstrate that our method outperforms existing distillation techniques for the same sampling time. Notably, our method is the first distillation strategy that can match the performance of the much slower fine-tuned conditional diffusion models.
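The single-stage, parameter-efficient idea described above can be caricatured in a toy numerical sketch. Everything here is an illustrative stand-in, not the paper's architecture: a frozen linear "backbone" plays the role of the shared pre-trained unconditional model, a small trainable adapter injects the image condition, and the student is trained to match, in one step, what a slower multi-step "teacher" sampler produces.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                                      # toy feature dimension

# Shared frozen "unconditional backbone" (stand-in for the pre-trained
# diffusion network) and a small per-task adapter (the only trained weights).
W_frozen = 0.1 * rng.normal(size=(d, d))
A = np.zeros((d, d))                       # trainable adapter, starts at zero

def student(x_t, c, A):
    """One-step conditional student: frozen backbone + adapter on condition c."""
    return x_t @ W_frozen + c @ A

def teacher(x_t, c, steps=4):
    """Slow multi-step conditional sampler the student is distilled from."""
    x = x_t
    for _ in range(steps):
        x = x + 0.25 * (c - x)             # toy iterative refinement toward c
    return x

def batch(n=16):
    c = rng.normal(size=(n, d))            # condition (e.g. a low-res image)
    x_t = c + rng.normal(size=(n, d))      # noisy sample to be denoised
    return x_t, c

x_t, c = batch()
init_mse = np.mean((student(x_t, c, A) - teacher(x_t, c)) ** 2)

lr = 0.1
for _ in range(500):
    x_t, c = batch()
    err = student(x_t, c, A) - teacher(x_t, c)
    grad_A = 2.0 * c.T @ err / len(c)      # gradient w.r.t. the adapter only
    A -= lr * grad_A                       # W_frozen is never updated

x_t, c = batch()
final_mse = np.mean((student(x_t, c, A) - teacher(x_t, c)) ** 2)
print(init_mse, final_mse)                 # distillation loss drops sharply
```

The design point the toy mirrors is the storage cost: after distillation, each task needs only its small adapter, while the frozen backbone is stored once and shared across all tasks.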