EM Distillation for One-step Diffusion Models

May 27, 2024
Authors: Sirui Xie, Zhisheng Xiao, Diederik P Kingma, Tingbo Hou, Ying Nian Wu, Kevin Patrick Murphy, Tim Salimans, Ben Poole, Ruiqi Gao
cs.AI

Abstract

While diffusion models can learn complex distributions, sampling requires a computationally expensive iterative process. Existing distillation methods enable efficient sampling but have notable limitations, such as performance degradation with very few sampling steps, reliance on access to training data, or mode-seeking optimization that may fail to capture the full distribution. We propose EM Distillation (EMD), a maximum-likelihood-based approach that distills a diffusion model into a one-step generator model with minimal loss of perceptual quality. Our approach is derived through the lens of Expectation-Maximization (EM), where the generator parameters are updated using samples from the joint distribution of the diffusion teacher prior and inferred generator latents. We develop a reparametrized sampling scheme and a noise cancellation technique that together stabilize the distillation process. We further reveal an interesting connection between our method and existing methods that minimize the mode-seeking KL divergence. EMD outperforms existing one-step generative methods in terms of FID scores on ImageNet-64 and ImageNet-128, and compares favorably with prior work on distilling text-to-image diffusion models.
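
The abstract describes an alternating EM-style procedure: an E-step that pairs teacher samples with inferred generator latents, and an M-step that updates the generator on those pairs. The sketch below is a minimal, hypothetical rendering of that loop in PyTorch, not the paper's algorithm: `teacher_sample` is an assumed interface, the Langevin-style latent inference and the squared-error surrogate for the joint log-likelihood are illustrative stand-ins, and EMD's actual reparametrized sampling scheme and noise cancellation technique are not shown.

```python
import torch

def emd_step(teacher_sample, generator, opt, batch_size, latent_dim,
             langevin_steps=8, step_size=1e-3):
    """One hypothetical E/M iteration of EM Distillation (illustrative only)."""
    # E-step (sketch): draw samples x from the diffusion teacher, then infer
    # generator latents z for those samples with a few Langevin updates on a
    # squared-error surrogate. The paper's reparametrized sampler and its
    # noise cancellation are omitted here.
    x = teacher_sample(batch_size)  # assumed callable returning teacher samples
    z = torch.randn(batch_size, latent_dim, requires_grad=True)
    for _ in range(langevin_steps):
        recon_err = ((generator(z) - x) ** 2).sum()
        grad, = torch.autograd.grad(recon_err, z)
        with torch.no_grad():
            z -= 0.5 * step_size * grad
            z += step_size ** 0.5 * torch.randn_like(z)
    # M-step: update generator parameters on the inferred (x, z) pairs,
    # a stand-in for maximizing the joint log-likelihood described above.
    opt.zero_grad()
    m_loss = ((generator(z.detach()) - x.detach()) ** 2).mean()
    m_loss.backward()
    opt.step()
    return m_loss.item()
```

Because the M-step fits the generator to samples drawn from the teacher's distribution (rather than optimizing a mode-seeking objective), this structure reflects the maximum-likelihood character of EMD noted in the abstract.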
