Evolutionary Caching to Accelerate Your Off-the-Shelf Diffusion Model

Abstract

Diffusion-based image generation models excel at producing high-quality synthetic content, but suffer from slow and computationally expensive inference. Prior work has attempted to mitigate this by caching and reusing features within diffusion transformers across inference steps. These methods, however, often rely on rigid heuristics that result in limited acceleration or poor generalization across architectures. We propose Evolutionary Caching to Accelerate Diffusion models (ECAD), a genetic algorithm that learns efficient, per-model caching schedules forming a Pareto frontier, using only a small set of calibration prompts. ECAD requires no modifications to network parameters or reference images. It offers significant inference speedups, enables fine-grained control over the quality-latency trade-off, and adapts seamlessly to different diffusion models. Notably, ECAD's learned schedules can generalize effectively to resolutions and model variants not seen during calibration. We evaluate ECAD on PixArt-alpha, PixArt-Sigma, and FLUX-1.dev using multiple metrics (FID, CLIP, Image Reward) across diverse benchmarks (COCO, MJHQ-30k, PartiPrompts), demonstrating consistent improvements over previous approaches. On PixArt-alpha, ECAD identifies a schedule that outperforms the previous state-of-the-art method by 4.47 COCO FID while increasing inference speedup from 2.35x to 2.58x. Our results establish ECAD as a scalable and generalizable approach for accelerating diffusion inference. Our project website is available at https://aniaggarwal.github.io/ecad and our code is available at https://github.com/aniaggarwal/ecad.
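
To make the idea of evolving caching schedules toward a Pareto frontier more concrete, the following is a minimal toy sketch and not the ECAD implementation. It assumes a schedule is a flat tuple of 0/1 flags, one per (step, block) pair, where 1 means "reuse the cached feature". The names `compute_cost`, `quality_loss`, and `evolve`, the problem sizes, and the synthetic quality proxy are all hypothetical. A real system would score each schedule by generating images from calibration prompts and measuring quality and latency.

```python
import random

# Hypothetical toy problem size; not taken from the paper.
NUM_STEPS, NUM_BLOCKS = 8, 4
GENOME_LEN = NUM_STEPS * NUM_BLOCKS


def compute_cost(schedule):
    """Fraction of block evaluations that are recomputed (lower is faster)."""
    return 1.0 - sum(schedule) / len(schedule)


def quality_loss(schedule):
    """Synthetic stand-in for a measured quality metric (lower is better).

    Assumption for illustration only: caching in early steps hurts more
    than caching in late steps.
    """
    loss = 0.0
    for i, cached in enumerate(schedule):
        step = i // NUM_BLOCKS
        if cached:
            loss += (NUM_STEPS - step) / NUM_STEPS
    return loss / len(schedule)


def objectives(schedule):
    return (compute_cost(schedule), quality_loss(schedule))


def dominates(a, b):
    """True if a is no worse than b in every objective and better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(scored):
    """Keep (schedule, objectives) pairs that no other pair dominates."""
    return [p for p in scored if not any(dominates(q[1], p[1]) for q in scored)]


def evolve(generations=20, pop_size=24, mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    population = [
        tuple(rng.randint(0, 1) for _ in range(GENOME_LEN)) for _ in range(pop_size)
    ]
    for _ in range(generations):
        children = []
        for _ in range(pop_size):
            mom, dad = rng.sample(population, 2)
            cut = rng.randrange(1, GENOME_LEN)  # single-point crossover
            child = mom[:cut] + dad[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = tuple(bit ^ (rng.random() < mutation_rate) for bit in child)
            children.append(child)
        scored = [(s, objectives(s)) for s in population + children]
        front = pareto_front(scored)
        dominated = [p for p in scored if p not in front]
        rng.shuffle(dominated)
        # Non-dominated schedules survive first; random dominated ones fill up.
        population = [s for s, _ in (front + dominated)[:pop_size]]
    final = [(s, objectives(s)) for s in set(population)]
    return sorted(pareto_front(final), key=lambda p: p[1][0])


if __name__ == "__main__":
    for schedule, (cost, loss) in evolve():
        print(f"compute={cost:.2f}  quality_loss={loss:.3f}")
```

Each entry in the returned list is one point on the toy frontier, so picking a schedule amounts to choosing where to sit on the quality-latency trade-off. The selection step here is a deliberately simple one; established multi-objective methods such as NSGA-II use more careful ranking and diversity preservation.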
