
HiGS: History-Guided Sampling for Plug-and-Play Enhancement of Diffusion Models

September 26, 2025
作者: Seyedmorteza Sadat, Farnood Salehi, Romann M. Weber
cs.AI

Abstract

While diffusion models have made remarkable progress in image generation, their outputs can still appear unrealistic and lack fine detail, especially when using fewer neural function evaluations (NFEs) or lower guidance scales. To address this issue, we propose a novel momentum-based sampling technique, termed history-guided sampling (HiGS), which enhances the quality and efficiency of diffusion sampling by integrating recent model predictions into each inference step. Specifically, HiGS leverages the difference between the current prediction and a weighted average of past predictions to steer the sampling process toward more realistic outputs with better details and structure. Our approach introduces practically no additional computation and integrates seamlessly into existing diffusion frameworks, requiring neither extra training nor fine-tuning. Extensive experiments show that HiGS consistently improves image quality across diverse models and architectures and under varying sampling budgets and guidance scales. Moreover, using a pretrained SiT model, HiGS achieves a new state-of-the-art FID of 1.61 for unguided ImageNet generation at 256×256 with only 30 sampling steps (instead of the standard 250). We thus present HiGS as a plug-and-play enhancement to standard diffusion sampling that enables faster generation with higher fidelity.
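The core mechanism described above — steering each step using the difference between the current prediction and a weighted average of past predictions — can be sketched as follows. This is only a minimal illustration under stated assumptions: the function name `higs_prediction`, the use of an exponential moving average as the "weighted average of past predictions", and the `momentum` and `guidance_weight` parameters are all illustrative choices, not the paper's exact formulation.

```python
import numpy as np

def higs_prediction(pred, history, momentum=0.9, guidance_weight=2.0):
    """One HiGS-style update: steer the current model prediction along the
    direction (current - average of past predictions).

    A minimal sketch of the idea in the abstract; the paper's exact update
    rule, weighting scheme, and hyperparameters may differ.
    """
    if history is None:
        history = pred  # first step: no past predictions yet
    # Weighted (exponential moving) average of past predictions.
    new_history = momentum * history + (1.0 - momentum) * pred
    # Extrapolate the current prediction away from the history average,
    # pushing sampling toward the direction of recent change.
    guided = pred + guidance_weight * (pred - new_history)
    return guided, new_history

# Toy usage: apply the guided update to a sequence of scalar "predictions"
# standing in for the denoiser outputs at successive sampling steps.
history = None
for pred in [np.array([1.0]), np.array([1.2]), np.array([1.5])]:
    guided, history = higs_prediction(pred, history)
```

Because the history term is a running average, the extra cost per step is a single weighted sum over cached predictions, which matches the abstract's claim of "practically no additional computation" and requires no retraining.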