Training-free Diffusion Acceleration with Bottleneck Sampling
March 24, 2025
Authors: Ye Tian, Xin Xia, Yuxi Ren, Shanchuan Lin, Xing Wang, Xuefeng Xiao, Yunhai Tong, Ling Yang, Bin Cui
cs.AI
Abstract
Diffusion models have demonstrated remarkable capabilities in visual content
generation but remain challenging to deploy due to their high computational
cost during inference. This computational burden primarily arises from the
quadratic complexity of self-attention with respect to image or video
resolution. While existing acceleration methods often compromise output quality
or necessitate costly retraining, we observe that most diffusion models are
pre-trained at lower resolutions, presenting an opportunity to exploit these
low-resolution priors for more efficient inference without degrading
performance. In this work, we introduce Bottleneck Sampling, a training-free
framework that leverages low-resolution priors to reduce computational overhead
while preserving output fidelity. Bottleneck Sampling follows a high-low-high
denoising workflow: it performs high-resolution denoising in the initial and
final stages while operating at lower resolutions in intermediate steps. To
mitigate aliasing and blurring artifacts, we further refine the resolution
transition points and adaptively shift the denoising timesteps at each stage.
We evaluate Bottleneck Sampling on both image and video generation tasks, where
extensive experiments demonstrate that it accelerates inference by up to
3× for image generation and 2.5× for video generation, all while
maintaining output quality comparable to the standard full-resolution sampling
process across multiple evaluation metrics. Code is available at:
https://github.com/tyfeld/Bottleneck-Sampling
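
To make the cost argument concrete, the following is a rough back-of-the-envelope sketch, not the authors' implementation. It assumes that self-attention cost per denoising step scales with the square of the token count, that tokens come from hypothetical 16-pixel patches, and that the stage resolutions and step counts shown are illustrative values rather than settings from the paper. It counts attention only and ignores all other layers and the cost of switching resolution.

```python
def attention_cost(height, width, patch=16):
    """Relative self-attention cost of one denoising step: (token count)^2.

    The patch size of 16 is an assumption made for illustration.
    """
    tokens = (height // patch) * (width // patch)
    return tokens ** 2


def schedule_cost(stages, patch=16):
    """Total attention cost over a list of (height, width, num_steps) stages."""
    return sum(steps * attention_cost(h, w, patch) for h, w, steps in stages)


# Hypothetical schedules; resolutions and step counts are NOT from the paper.
full_res = [(1024, 1024, 20)]
high_low_high = [
    (1024, 1024, 4),   # initial high-resolution stage
    (512, 512, 12),    # intermediate low-resolution stage
    (1024, 1024, 4),   # final high-resolution stage
]

ratio = schedule_cost(full_res) / schedule_cost(high_low_high)
print(f"attention-only cost ratio: {ratio:.2f}x")  # prints 2.29x
```

Under these assumptions, halving each side length cuts the per-step attention cost by a factor of 16, so the total is dominated by the high-resolution steps kept at the start and end. Measured end-to-end speedups depend on the model and the actual schedule, so this ratio should not be read as a reproduction of the reported numbers.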