Applying Guidance in a Limited Interval Improves Sample and Distribution Quality in Diffusion Models
April 11, 2024
Authors: Tuomas Kynkäänniemi, Miika Aittala, Tero Karras, Samuli Laine, Timo Aila, Jaakko Lehtinen
cs.AI
Abstract
Guidance is a crucial technique for extracting the best performance out of
image-generating diffusion models. Traditionally, a constant guidance weight
has been applied throughout the sampling chain of an image. We show that
guidance is clearly harmful toward the beginning of the chain (high noise
levels), largely unnecessary toward the end (low noise levels), and only
beneficial in the middle. We thus restrict it to a specific range of noise
levels, improving both the inference speed and result quality. This limited
guidance interval improves the record FID in ImageNet-512 significantly, from
1.81 to 1.40. We show that it is quantitatively and qualitatively beneficial
across different sampler parameters, network architectures, and datasets,
including the large-scale setting of Stable Diffusion XL. We thus suggest
exposing the guidance interval as a hyperparameter in all diffusion models that
use guidance.
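The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the common classifier-free guidance form, in which the guided output extrapolates from the unconditional prediction toward the conditional one, so a weight of 1 means no guidance. The names `guidance_weight`, `guided_denoise`, `sigma_lo`, and `sigma_hi`, the half-open interval convention, and all numeric values are hypothetical and chosen for illustration only.

```python
from typing import Callable

# Hypothetical denoiser signature: (noisy sample, noise level) -> prediction.
# Plain floats stand in for image tensors to keep the sketch self-contained.
Denoiser = Callable[[float, float], float]


def guidance_weight(sigma: float, w: float, sigma_lo: float, sigma_hi: float) -> float:
    """Return w inside the interval (sigma_lo, sigma_hi], else 1.0 (guidance off)."""
    return w if sigma_lo < sigma <= sigma_hi else 1.0


def guided_denoise(
    d_cond: Denoiser,
    d_uncond: Denoiser,
    x: float,
    sigma: float,
    w: float,
    sigma_lo: float,
    sigma_hi: float,
) -> float:
    """One denoiser evaluation with guidance limited to a noise-level interval."""
    w_eff = guidance_weight(sigma, w, sigma_lo, sigma_hi)
    cond = d_cond(x, sigma)
    if w_eff == 1.0:
        # Outside the interval the unconditional model is never evaluated,
        # which is where the inference-speed saving would come from.
        return cond
    uncond = d_uncond(x, sigma)
    # Assumed guidance form: extrapolate from unconditional toward conditional.
    return uncond + w_eff * (cond - uncond)


if __name__ == "__main__":
    # Toy stand-ins for the two networks (assumptions, not real models).
    d_c: Denoiser = lambda x, sigma: 2.0
    d_u: Denoiser = lambda x, sigma: 1.0
    for sigma in (50.0, 1.0, 0.1):  # high, middle, and low noise levels
        print(sigma, guided_denoise(d_c, d_u, 0.0, sigma, 3.0, 0.3, 5.0))
```

In this sketch a sampler would call `guided_denoise` at every step. Only the steps whose noise level falls inside the interval pay for the second network evaluation, and the interval endpoints become tunable hyperparameters alongside the guidance weight, in line with the abstract's suggestion.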