LightCache: Memory-Efficient, Training-Free Acceleration for Video Generation
October 6, 2025
Authors: Yang Xiao, Gen Li, Kaiyuan Deng, Yushu Wu, Zheng Zhan, Yanzhi Wang, Xiaolong Ma, Bo Hui
cs.AI
Abstract
Training-free acceleration has emerged as a prominent research direction in video
generation based on diffusion models. The redundancy of latents in diffusion
model inference provides a natural entry point for acceleration. In this paper,
we decompose the inference process into the encoding, denoising, and decoding
stages, and observe that cache-based acceleration methods often lead to
substantial memory surges in the latter two stages. To address this problem, we
analyze the characteristics of inference across different stages and propose
stage-specific strategies for reducing memory consumption: 1) Asynchronous
Cache Swapping, 2) Feature Chunking, and 3) Sliced Latent Decoding (sketched
below). At the same
time, we ensure that the time overhead introduced by these three strategies
remains lower than the acceleration gains themselves. Compared with the
baseline, our approach achieves faster inference speed and lower memory usage,
while keeping quality degradation within an acceptable range. The code is
available at https://github.com/NKUShaw/LightCache.
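
The abstract does not spell out implementation details, but the three strategies map onto familiar PyTorch patterns. The sketches below are minimal illustrations under stated assumptions, not the authors' implementation; all class, function, and parameter names (AsyncCacheSwapper, chunked_forward, decode_latents_in_slices, slice_frames, and so on) are hypothetical.

First, asynchronous cache swapping: one plausible reading is that cached features are offloaded to pinned CPU memory on a side CUDA stream, so the device-to-host transfer overlaps with denoising compute on the main stream rather than stalling it.

```python
import torch

class AsyncCacheSwapper:
    """Hypothetical sketch: cached features are copied to pinned CPU memory
    on a side CUDA stream so the transfer overlaps with denoising compute."""

    def __init__(self):
        self.copy_stream = torch.cuda.Stream()  # dedicated transfer stream
        self.cpu_cache = {}

    def offload(self, key, feature):
        # Pinned host memory is required for a truly asynchronous D2H copy.
        host_buf = torch.empty(feature.shape, dtype=feature.dtype,
                               device="cpu", pin_memory=True)
        # Let the transfer stream see the finished feature before copying.
        self.copy_stream.wait_stream(torch.cuda.current_stream())
        with torch.cuda.stream(self.copy_stream):
            host_buf.copy_(feature, non_blocking=True)
            # Keep the source memory reserved until the side-stream copy ends.
            feature.record_stream(self.copy_stream)
        self.cpu_cache[key] = host_buf

    def fetch(self, key, device="cuda"):
        # Ensure the offload finished before the main stream reuses the data.
        torch.cuda.current_stream().wait_stream(self.copy_stream)
        return self.cpu_cache[key].to(device, non_blocking=True)
```

Second, feature chunking can be read as splitting a large activation tensor and running a memory-heavy module piecewise, capping peak activation memory at the cost of extra kernel launches:

```python
import torch

def chunked_forward(module, feature, num_chunks=4, dim=0):
    # Run the module on pieces of the feature tensor and re-concatenate.
    # Only valid when `module` acts independently along `dim`
    # (e.g., a batch or frame axis).
    parts = [module(chunk) for chunk in feature.chunk(num_chunks, dim=dim)]
    return torch.cat(parts, dim=dim)
```

Third, sliced decoding of latents: decode a few frames at a time instead of the whole clip. The sketch assumes the decoder treats frame slices independently; a VAE with temporal convolutions would need overlapping slices or stateful decoding. For comparison, diffusers' AutoencoderKL exposes enable_slicing() and enable_tiling() for related decode-memory savings.

```python
import torch

@torch.no_grad()
def decode_latents_in_slices(vae, latents, slice_frames=4):
    # latents: (batch, channels, frames, height, width) video latents.
    # Decoding the whole clip at once makes peak activation memory grow
    # with the frame count; decoding a few frames at a time bounds it.
    outputs = []
    for start in range(0, latents.shape[2], slice_frames):
        outputs.append(vae.decode(latents[:, :, start:start + slice_frames]))
    return torch.cat(outputs, dim=2)
```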