Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention

August 1, 2024
Author: Susung Hong
cs.AI

Abstract

Conditional diffusion models have shown remarkable success in visual content generation, producing high-quality samples across various domains, largely due to classifier-free guidance (CFG). Recent attempts to extend guidance to unconditional models have relied on heuristic techniques, resulting in suboptimal generation quality and unintended effects. In this work, we propose Smoothed Energy Guidance (SEG), a novel training- and condition-free approach that leverages the energy-based perspective of the self-attention mechanism to enhance image generation. By defining the energy of self-attention, we introduce a method to reduce the curvature of the energy landscape of attention and use the output as the unconditional prediction. Practically, we control the curvature of the energy landscape by adjusting the Gaussian kernel parameter while keeping the guidance scale parameter fixed. Additionally, we present a query blurring method that is equivalent to blurring the entire attention weights without incurring quadratic complexity in the number of tokens. In our experiments, SEG achieves a Pareto improvement in both quality and the reduction of side effects. The code is available at https://github.com/SusungHong/SEG-SDXL.
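
The query-blurring claim in the abstract rests on linearity, which the following minimal sketch illustrates. It is not the author's implementation. The helper names (`gaussian_blur_matrix`, `matmul`, `transpose`), the 1D token layout, the toy sizes, and the value of `sigma` are all assumptions made for illustration. The sketch checks the identity on the pre-softmax scores QK^T, where linearity applies directly: blurring the queries along the token axis gives the same scores as blurring the full N x N score matrix along its query axis.

```python
import math
import random


def gaussian_blur_matrix(n, sigma):
    """Row-normalized n x n matrix that Gaussian-blurs a 1D sequence of n tokens."""
    rows = []
    for i in range(n):
        weights = [math.exp(-((i - j) ** 2) / (2.0 * sigma ** 2)) for j in range(n)]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


random.seed(0)
n_tokens, dim = 6, 4  # toy sizes, chosen only for illustration
Q = [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_tokens)]
K = [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_tokens)]
B = gaussian_blur_matrix(n_tokens, sigma=1.5)  # sigma is a hypothetical value

# Path 1: blur the queries first, then form the attention scores.
scores_from_blurred_queries = matmul(matmul(B, Q), transpose(K))

# Path 2: form the full N x N score matrix, then blur it along the query axis.
blurred_scores = matmul(B, matmul(Q, transpose(K)))

# Largest elementwise gap between the two paths (floating-point rounding only).
max_diff = max(
    abs(x - y)
    for row_a, row_b in zip(scores_from_blurred_queries, blurred_scores)
    for x, y in zip(row_a, row_b)
)
print(max_diff < 1e-9)
```

The dense matrix `B` is used here only to make the identity easy to verify. In practice, a blur over the queries would be applied as a convolution on the N x d query tensor, so the N x N score matrix never needs to be blurred explicitly. Images would use a 2D blur over the spatial token grid rather than the 1D blur assumed here.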
