Mitigating the Noise Shift for Denoising Generative Models via Noise Awareness Guidance

Abstract

Existing denoising generative models rely on solving discretized reverse-time SDEs or ODEs. In this paper, we identify a long-overlooked yet pervasive issue in this family of models: a misalignment between the pre-defined noise level and the actual noise level encoded in intermediate states during sampling. We refer to this misalignment as noise shift. Through empirical analysis, we demonstrate that noise shift is widespread in modern diffusion models and exhibits a systematic bias, which leads to sub-optimal generation through both out-of-distribution generalization and inaccurate denoising updates. To address this problem, we propose Noise Awareness Guidance (NAG), a simple yet effective correction method that explicitly steers sampling trajectories to remain consistent with the pre-defined noise schedule. We further introduce a classifier-free variant of NAG, which jointly trains a noise-conditional and a noise-unconditional model via noise-condition dropout, thereby eliminating the need for external classifiers. Extensive experiments, including ImageNet generation and various supervised fine-tuning tasks, show that NAG consistently mitigates noise shift and substantially improves the generation quality of mainstream diffusion models.
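
The classifier-free variant can be pictured with a minimal sketch. The abstract does not give the exact NAG update rule, so the code below is only an illustration by analogy with standard classifier-free guidance, with the noise level playing the role of the condition. The function names, the null marker `-1.0`, and the linear extrapolation formula are all assumptions, not the authors' implementation.

```python
import numpy as np

def drop_noise_condition(t, drop_prob, rng, null_value=-1.0):
    """Noise-condition dropout (hypothetical form): with probability
    drop_prob, replace a sample's noise level by a null marker so that
    the same network also learns a noise-unconditional prediction."""
    t = np.asarray(t, dtype=float)
    mask = rng.random(t.shape) < drop_prob
    return np.where(mask, null_value, t)

def guided_prediction(pred_cond, pred_uncond, scale):
    """CFG-style combination (assumed, not taken from the paper):
    extrapolate from the noise-unconditional output toward the
    noise-conditional output. scale=1 gives the conditional output,
    scale=0 gives the unconditional output."""
    pred_cond = np.asarray(pred_cond, dtype=float)
    pred_uncond = np.asarray(pred_uncond, dtype=float)
    return pred_uncond + scale * (pred_cond - pred_uncond)

# Usage: mask noise levels for a training batch, then combine two
# model outputs at sampling time with a guidance scale above 1.
rng = np.random.default_rng(0)
t_batch = drop_noise_condition([0.1, 0.5, 0.9], drop_prob=0.1, rng=rng)
guided = guided_prediction([1.0, 2.0], [0.5, 1.0], scale=1.5)
```

Under this reading, a scale above 1 pushes each denoising step further in the direction that the noise-conditional model prefers over the noise-unconditional one, which is one plausible way to keep the trajectory aligned with the pre-defined noise schedule.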

