Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention

August 1, 2024
Author: Susung Hong
cs.AI

Abstract

Conditional diffusion models have shown remarkable success in visual content generation, producing high-quality samples across various domains, largely due to classifier-free guidance (CFG). Recent attempts to extend guidance to unconditional models have relied on heuristic techniques, resulting in suboptimal generation quality and unintended effects. In this work, we propose Smoothed Energy Guidance (SEG), a novel training- and condition-free approach that leverages the energy-based perspective of the self-attention mechanism to enhance image generation. By defining the energy of self-attention, we introduce a method to reduce the curvature of the energy landscape of attention and use the output as the unconditional prediction. Practically, we control the curvature of the energy landscape by adjusting the Gaussian kernel parameter while keeping the guidance scale parameter fixed. Additionally, we present a query blurring method that is equivalent to blurring the entire attention weights without incurring quadratic complexity in the number of tokens. In our experiments, SEG achieves a Pareto improvement in both quality and the reduction of side effects. The code is available at https://github.com/SusungHong/SEG-SDXL.
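
As a rough illustration of the query-blurring claim in the abstract, the NumPy sketch below checks that applying a linear blur to the queries gives the same pre-softmax attention scores as blurring the full score matrix along the query axis. It is not taken from the SEG-SDXL repository. The 1D token layout, the dense blur matrix, all function names, and the CFG-style combination in `guided_prediction` are assumptions made for this example only.

```python
import numpy as np

def gaussian_blur_matrix(n_tokens, sigma):
    """Row-normalized Gaussian blur over a 1D token axis (illustrative assumption).

    A dense matrix is used only for clarity; a practical implementation
    would more likely apply a small convolution kernel instead.
    """
    idx = np.arange(n_tokens)
    dist2 = (idx[:, None] - idx[None, :]) ** 2
    kernel = np.exp(-dist2 / (2.0 * sigma ** 2))
    return kernel / kernel.sum(axis=1, keepdims=True)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    """Plain scaled dot-product self-attention for a single head."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def guided_prediction(pred_original, pred_smoothed, scale):
    """CFG-style extrapolation (assumed form): the smoothed branch plays
    the role of the unconditional prediction."""
    return pred_smoothed + scale * (pred_original - pred_smoothed)

rng = np.random.default_rng(0)
n, d = 16, 8  # toy sizes: 16 tokens, 8 channels
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
B = gaussian_blur_matrix(n, sigma=2.0)

# Blurring the n x n score matrix along the query axis ...
scores_blurred_directly = B @ (q @ k.T)
# ... matches blurring only the n x d queries, by associativity of matmul.
scores_from_blurred_q = (B @ q) @ k.T
assert np.allclose(scores_blurred_directly, scores_from_blurred_q)

# Toy stand-ins for the two predictions; in a real diffusion model these
# would be denoiser outputs from two forward passes, not attention outputs.
out_original = attention(q, k, v)
out_smoothed = attention(B @ q, k, v)
out_guided = guided_prediction(out_original, out_smoothed, scale=3.0)
```

In this toy setup, a larger `sigma` flattens the attention scores more strongly while `scale` stays fixed, which mirrors the abstract's description of controlling the curvature through the Gaussian kernel parameter rather than the guidance scale.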
