Training-free Diffusion Acceleration with Bottleneck Sampling

March 24, 2025
Authors: Ye Tian, Xin Xia, Yuxi Ren, Shanchuan Lin, Xing Wang, Xuefeng Xiao, Yunhai Tong, Ling Yang, Bin Cui
cs.AI

Abstract

Diffusion models have demonstrated remarkable capabilities in visual content generation but remain challenging to deploy because of their high computational cost during inference. This burden arises primarily from the quadratic complexity of self-attention with respect to image or video resolution. Existing acceleration methods often compromise output quality or require costly retraining. We observe, however, that most diffusion models are pre-trained at lower resolutions, which presents an opportunity to exploit these low-resolution priors for more efficient inference without degrading performance. In this work, we introduce Bottleneck Sampling, a training-free framework that leverages low-resolution priors to reduce computational overhead while preserving output fidelity. Bottleneck Sampling follows a high-low-high denoising workflow: it performs high-resolution denoising in the initial and final stages and operates at lower resolutions in the intermediate steps. To mitigate aliasing and blurring artifacts, we further refine the resolution transition points and adaptively shift the denoising timesteps at each stage. We evaluate Bottleneck Sampling on both image and video generation tasks, where extensive experiments demonstrate that it accelerates inference by up to 3× for image generation and 2.5× for video generation, while maintaining output quality comparable to the standard full-resolution sampling process across multiple evaluation metrics. Code is available at: https://github.com/tyfeld/Bottleneck-Sampling
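The high-low-high workflow and the quadratic cost of self-attention can be illustrated with a rough back-of-the-envelope estimate. The sketch below is not the authors' implementation: the patch size, resolutions, step counts, and the function names `attention_cost` and `schedule_cost` are all hypothetical, and it counts only attention cost, ignoring linear layers, resampling overhead, and the timestep shifting described above.

```python
def attention_cost(height, width, patch=16):
    """Relative self-attention cost, quadratic in the number of tokens.

    The patch size of 16 is an assumption for illustration only.
    """
    tokens = (height // patch) * (width // patch)
    return tokens ** 2


def schedule_cost(stages, patch=16):
    """Sum attention cost over a list of (height, width, steps) stages."""
    return sum(steps * attention_cost(h, w, patch) for h, w, steps in stages)


# Hypothetical schedules; these numbers are NOT taken from the paper.
baseline = [(1024, 1024, 30)]          # full resolution for every step
bottleneck = [
    (1024, 1024, 5),                   # high-resolution initial stage
    (512, 512, 20),                    # low-resolution intermediate stage
    (1024, 1024, 5),                   # high-resolution final stage
]

speedup = schedule_cost(baseline) / schedule_cost(bottleneck)
print(f"estimated attention-only speedup: {speedup:.2f}x")
```

Under these assumed numbers, halving each spatial side cuts the token count by 4 and the attention cost per step by 16, so most of the remaining cost comes from the high-resolution stages. Measured speedups depend on the actual model and schedule.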
