Applying Guidance in a Limited Interval Improves Sample and Distribution Quality in Diffusion Models

April 11, 2024
Authors: Tuomas Kynkäänniemi, Miika Aittala, Tero Karras, Samuli Laine, Timo Aila, Jaakko Lehtinen
cs.AI

Abstract

Guidance is a crucial technique for extracting the best performance out of image-generating diffusion models. Traditionally, a constant guidance weight has been applied throughout the sampling chain of an image. We show that guidance is clearly harmful toward the beginning of the chain (high noise levels), largely unnecessary toward the end (low noise levels), and only beneficial in the middle. We thus restrict it to a specific range of noise levels, improving both the inference speed and result quality. This limited guidance interval improves the record FID in ImageNet-512 significantly, from 1.81 to 1.40. We show that it is quantitatively and qualitatively beneficial across different sampler parameters, network architectures, and datasets, including the large-scale setting of Stable Diffusion XL. We thus suggest exposing the guidance interval as a hyperparameter in all diffusion models that use guidance.
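
The idea of restricting guidance to a range of noise levels can be illustrated with a minimal sketch. It assumes the usual classifier-free guidance combination of a conditional and an unconditional denoiser, which the abstract does not spell out. The function name `guided_denoise`, the parameters `sigma_lo` and `sigma_hi`, and the two denoiser callables are hypothetical names chosen for illustration; they are not taken from the authors' code.

```python
def guided_denoise(denoise_cond, denoise_uncond, x, sigma,
                   weight, sigma_lo, sigma_hi):
    """Denoise x at noise level sigma, guiding only inside (sigma_lo, sigma_hi].

    denoise_cond / denoise_uncond are hypothetical callables (x, sigma) -> estimate.
    Assumes the standard classifier-free guidance formula; the interval bounds
    are hyperparameters the caller must choose.
    """
    d_cond = denoise_cond(x, sigma)

    # Outside the interval, guidance is switched off: only the conditional
    # model is evaluated, which saves one network evaluation for this step.
    if not (sigma_lo < sigma <= sigma_hi):
        return d_cond

    # Inside the interval, extrapolate from the unconditional result
    # toward the conditional one by the guidance weight.
    d_uncond = denoise_uncond(x, sigma)
    return d_uncond + weight * (d_cond - d_uncond)
```

In a sampler loop, this function would be called once per step in place of a constant-weight guided denoiser. Steps at high and low noise levels then need a single model evaluation instead of two, which is consistent with the inference speedup the abstract mentions.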