

S^2-Guidance: Stochastic Self Guidance for Training-Free Enhancement of Diffusion Models

August 18, 2025
作者: Chubin Chen, Jiashu Zhu, Xiaokun Feng, Nisha Huang, Meiqi Wu, Fangyuan Mao, Jiahong Wu, Xiangxiang Chu, Xiu Li
cs.AI

Abstract

Classifier-free Guidance (CFG) is a widely used technique in modern diffusion models for enhancing sample quality and prompt adherence. However, through an empirical analysis on Gaussian mixture modeling with a closed-form solution, we observe a discrepancy between the suboptimal results produced by CFG and the ground truth. The model's excessive reliance on these suboptimal predictions often leads to semantic incoherence and low-quality outputs. To address this issue, we first empirically demonstrate that the model's suboptimal predictions can be effectively refined using sub-networks of the model itself. Building on this insight, we propose S^2-Guidance, a novel method that leverages stochastic block-dropping during the forward process to construct stochastic sub-networks, effectively guiding the model away from potential low-quality predictions and toward high-quality outputs. Extensive qualitative and quantitative experiments on text-to-image and text-to-video generation tasks demonstrate that S^2-Guidance delivers superior performance, consistently surpassing CFG and other advanced guidance strategies. Our code will be released.
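
The abstract only sketches the mechanism, so the snippet below is a minimal illustrative sketch rather than the paper's implementation: it combines a standard CFG update with a correction term that pushes the prediction away from the output of a randomly block-dropped sub-network of the same model. The helper `drop_random_blocks`, the call signature `model(x_t, t, cond)`, the attribute name `blocks`, and the weights `cfg_scale` and `sub_scale` are all assumptions for illustration.

```python
# Illustrative sketch of one guided denoising prediction in the spirit of
# S^2-Guidance: CFG plus a push away from a stochastic sub-network's output.
# All names, signatures, and the exact combination rule are assumptions.
import copy
import random

import torch
import torch.nn as nn


def drop_random_blocks(model: nn.Module, block_attr: str = "blocks",
                       drop_prob: float = 0.1) -> nn.Module:
    """Return a copy of `model` with some blocks replaced by identity layers,
    forming a stochastic sub-network. Assumes `model.blocks` is an
    nn.ModuleList whose blocks each map a single tensor to a tensor."""
    sub = copy.deepcopy(model)
    blocks = getattr(sub, block_attr)
    for i in range(len(blocks)):
        if random.random() < drop_prob:
            blocks[i] = nn.Identity()
    return sub


@torch.no_grad()
def s2_guided_eps(model, x_t, t, cond, uncond,
                  cfg_scale: float = 7.5, sub_scale: float = 1.0):
    """One guided noise prediction: standard CFG, then a correction that
    steers away from the (weaker) stochastic sub-network prediction."""
    eps_cond = model(x_t, t, cond)
    eps_uncond = model(x_t, t, uncond)
    eps_cfg = eps_uncond + cfg_scale * (eps_cond - eps_uncond)

    # A freshly sampled sub-network supplies the low-quality direction to avoid.
    sub_model = drop_random_blocks(model)
    eps_sub = sub_model(x_t, t, cond)

    return eps_cfg + sub_scale * (eps_cfg - eps_sub)
```

In practice one would resample the sub-network at each denoising step (or cache several sub-networks) rather than deep-copying the full model every call; the copy here just keeps the sketch self-contained.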