DeepCache: Accelerating Diffusion Models for Free
December 1, 2023
Authors: Xinyin Ma, Gongfan Fang, Xinchao Wang
cs.AI
Abstract
Diffusion models have recently gained unprecedented attention in the field of
image synthesis due to their remarkable generative capabilities.
Notwithstanding their prowess, these models often incur substantial
computational costs, primarily attributed to the sequential denoising process
and cumbersome model size. Traditional methods for compressing diffusion models
typically involve extensive retraining, presenting cost and feasibility
challenges. In this paper, we introduce DeepCache, a novel training-free
paradigm that accelerates diffusion models from the perspective of model
architecture. DeepCache capitalizes on the inherent temporal redundancy
observed in the sequential denoising steps of diffusion models, which caches
and retrieves features across adjacent denoising stages, thereby curtailing
redundant computations. Utilizing the property of the U-Net, we reuse the
high-level features while updating the low-level features in a very cheap way.
This innovative strategy, in turn, enables a speedup factor of 2.3× for
Stable Diffusion v1.5 with only a 0.05 decline in CLIP Score, and 4.1×
for LDM-4-G with a slight decrease of 0.22 in FID on ImageNet. Our experiments
also demonstrate DeepCache's superiority over existing pruning and distillation
methods that necessitate retraining and its compatibility with current sampling
techniques. Furthermore, we find that under the same throughput, DeepCache
effectively achieves results comparable to, or even marginally better than,
DDIM or PLMS. The code is available at https://github.com/horseee/DeepCache.
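The caching idea described above can be sketched in a few lines. The toy denoiser below is a hypothetical stand-in for a U-Net (the `shallow`/`deep` functions and the update rule are illustrative assumptions, not the paper's actual architecture): the cheap low-level path is recomputed at every denoising step, while the expensive high-level features are refreshed only every `cache_interval` steps and reused in between.

```python
import math

def shallow(x):
    # Cheap low-level path (hypothetical stand-in for U-Net skip features).
    return [math.tanh(0.5 * v + 0.1) for v in x]

def deep(h):
    # Expensive high-level path (stand-in for the inner U-Net blocks).
    return [math.tanh(sum(h) / len(h) + v) for v in h]

def denoise_with_deepcache(x, steps=10, cache_interval=3):
    """Sequential denoising loop that recomputes deep features only every
    `cache_interval` steps, reusing the cached ones on intermediate steps."""
    cached = None
    deep_calls = 0
    for t in range(steps):
        skip = shallow(x)                 # recomputed every step (cheap)
        if t % cache_interval == 0 or cached is None:
            cached = deep(skip)           # full pass: refresh the cache
            deep_calls += 1
        # Decoder merges the cheap skip features with the cached deep features.
        x = [s + d for s, d in zip(skip, cached)]
    return x, deep_calls
```

With `steps=10` and `cache_interval=3`, the expensive path runs only 4 times (at steps 0, 3, 6, 9) instead of 10, mirroring how DeepCache trades a small amount of feature staleness for a large reduction in computation.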