LightCache: Memory-Efficient, Training-Free Acceleration for Video Generation
October 6, 2025
Authors: Yang Xiao, Gen Li, Kaiyuan Deng, Yushu Wu, Zheng Zhan, Yanzhi Wang, Xiaolong Ma, Bo Hui
cs.AI
Abstract
Training-free acceleration has emerged as an active research direction in video
generation based on diffusion models. The redundancy of latents in diffusion
model inference provides a natural entry point for acceleration. In this paper,
we decompose the inference process into the encoding, denoising, and decoding
stages, and observe that cache-based acceleration methods often lead to
substantial memory surges in the latter two stages. To address this problem, we
analyze the characteristics of inference across different stages and propose
stage-specific strategies for reducing memory consumption: 1) asynchronous
cache swapping, 2) feature chunking, and 3) sliced latent decoding. At the same
time, we ensure that the time overhead introduced by these three strategies
remains lower than the acceleration gains they deliver. Compared with the
baseline, our approach achieves faster inference and lower memory usage while
keeping quality degradation within an acceptable range. Code is available at
https://github.com/NKUShaw/LightCache.
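
To make the third strategy concrete, the minimal PyTorch sketch below illustrates the general idea of sliced latent decoding: the video latent is split into small temporal slices so that only a few frames pass through the VAE decoder at a time, keeping peak decoding memory roughly constant in the video length at the cost of a small amount of extra latency. The function decode_latents_in_slices, its decode_fn argument, and the assumed latent layout (batch, channels, frames, height, width) are illustrative assumptions and are not taken from the LightCache repository.

import torch

@torch.no_grad()
def decode_latents_in_slices(decode_fn, latents, slice_size=4):
    # Illustrative sketch, not the LightCache implementation.
    # Assumes `latents` has shape (batch, channels, frames, height, width),
    # batch size 1 for simple frame ordering, and that `decode_fn` maps a
    # (batch, channels, height, width) latent chunk to decoded frames.
    b, c, f, h, w = latents.shape
    decoded_chunks = []
    for start in range(0, f, slice_size):
        # Take a small group of frames and fold them into the batch
        # dimension so the decoder only sees `slice_size` frames at once.
        chunk = latents[:, :, start:start + slice_size]
        chunk = chunk.permute(0, 2, 1, 3, 4).reshape(-1, c, h, w)
        decoded_chunks.append(decode_fn(chunk))
    # Reassemble the decoded frames in their original temporal order.
    return torch.cat(decoded_chunks, dim=0)

Because each slice is decoded independently, peak decoder memory is governed by slice_size rather than by the total number of frames; smaller slices lower the memory ceiling but add per-chunk launch overhead, which is the time-versus-memory trade-off the abstract refers to.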