
Scale-wise Distillation of Diffusion Models

March 20, 2025
作者: Nikita Starodubcev, Denis Kuznedelev, Artem Babenko, Dmitry Baranchuk
cs.AI

Abstract

We present SwD, a scale-wise distillation framework for diffusion models (DMs), which effectively employs next-scale prediction ideas for diffusion-based few-step generators. In more detail, SwD is inspired by recent insights relating diffusion processes to implicit spectral autoregression. We hypothesize that DMs can initiate generation at lower data resolutions and gradually upscale the samples at each denoising step without loss in performance, while significantly reducing computational costs. SwD naturally integrates this idea into existing diffusion distillation methods based on distribution matching. We also enrich the family of distribution matching approaches by introducing a novel patch loss that enforces finer-grained similarity to the target distribution. When applied to state-of-the-art text-to-image diffusion models, SwD approaches the inference time of two full-resolution steps and significantly outperforms its counterparts under the same computation budget, as evidenced by automated metrics and human preference studies.
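The core idea above — begin denoising at a low resolution and grow the sample between steps — can be sketched as a simple sampling loop. This is a minimal illustration, not the authors' implementation: the denoiser, noise schedule, upscaling operator, and all names (`scale_wise_sample`, `dummy_denoiser`) are assumptions for demonstration; a real SwD pipeline would use a distilled text-to-image model and a learned or latent-space upscaler.

```python
import numpy as np

def upscale(x, factor=2):
    # Nearest-neighbor upsampling as a stand-in for the upscaling
    # a real scale-wise pipeline would perform between steps.
    return x.repeat(factor, axis=-2).repeat(factor, axis=-1)

def dummy_denoiser(x, sigma):
    # Placeholder for a distilled few-step generator; a real model
    # predicts the clean sample from noisy input at noise level sigma.
    return x / (1.0 + sigma)

def scale_wise_sample(denoiser, base_res=8, steps=3, seed=0):
    """Hypothetical scale-wise sampling: start from low-resolution
    noise and double the resolution after each denoising step."""
    rng = np.random.default_rng(seed)
    sigmas = np.linspace(1.0, 0.1, steps)  # assumed noise schedule
    x = rng.standard_normal((1, 3, base_res, base_res))
    for i, sigma in enumerate(sigmas):
        x = denoiser(x, sigma)
        if i < steps - 1:
            x = upscale(x)  # grow resolution between denoising steps
            # re-inject noise matched to the current noise level
            x = x + sigma * rng.standard_normal(x.shape)
    return x

sample = scale_wise_sample(dummy_denoiser)
print(sample.shape)  # final resolution = base_res * 2**(steps - 1)
```

Because only the final step runs at full resolution, the earlier steps operate on far fewer pixels, which is the source of the computational savings the abstract claims relative to full-resolution few-step sampling.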

