Evolutionary Caching to Accelerate Your Off-the-Shelf Diffusion Model

Abstract

Diffusion-based image generation models excel at producing high-quality synthetic content, but their inference is slow and computationally expensive. Prior work has attempted to mitigate this by caching and reusing features within diffusion transformers across inference steps. These methods, however, often rely on rigid heuristics that yield limited acceleration or generalize poorly across architectures. We propose Evolutionary Caching to Accelerate Diffusion models (ECAD), a genetic algorithm that uses only a small set of calibration prompts to learn efficient, per-model caching schedules that form a Pareto frontier. ECAD requires neither modifications to network parameters nor reference images. It offers significant inference speedups, enables fine-grained control over the quality-latency trade-off, and adapts seamlessly to different diffusion models. Notably, ECAD's learned schedules generalize effectively to resolutions and model variants not seen during calibration. We evaluate ECAD on PixArt-alpha, PixArt-Sigma, and FLUX-1.dev using multiple metrics (FID, CLIP, Image Reward) across diverse benchmarks (COCO, MJHQ-30k, PartiPrompts), demonstrating consistent improvements over previous approaches. On PixArt-alpha, ECAD identifies a schedule that improves on the previous state-of-the-art method by 4.47 COCO FID (lower is better) while increasing inference speedup from 2.35x to 2.58x. Our results establish ECAD as a scalable and generalizable approach for accelerating diffusion inference. Our project website is available at https://aniaggarwal.github.io/ecad and our code is available at https://github.com/aniaggarwal/ecad.
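The search described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It only shows the general shape of a genetic search over binary caching schedules that keeps a Pareto frontier of two competing objectives. The schedule layout (steps by blocks, True meaning "reuse the cached output"), the sizes, and both objective functions (`compute_cost`, `staleness_penalty`) are assumptions made for illustration. A real setup would instead score each schedule by running the diffusion model on the calibration prompts.

```python
import random

STEPS, BLOCKS = 8, 4  # hypothetical: 8 denoising steps, 4 cacheable blocks


def random_schedule(rng, p_cache=0.5):
    # Row 0 is always recomputed: there is nothing to reuse at the first step.
    rows = [tuple([False] * BLOCKS)]
    for _ in range(STEPS - 1):
        rows.append(tuple(rng.random() < p_cache for _ in range(BLOCKS)))
    return tuple(rows)


def compute_cost(schedule):
    # Toy latency proxy: fraction of block evaluations actually computed.
    computed = sum(not cached for row in schedule for cached in row)
    return computed / (STEPS * BLOCKS)


def staleness_penalty(schedule):
    # Toy quality proxy: a reused feature costs more the older it is.
    penalty = 0
    for b in range(BLOCKS):
        age = 0
        for s in range(STEPS):
            if schedule[s][b]:
                age += 1
                penalty += age
            else:
                age = 0
    return penalty


def evaluate(schedule):
    # Both objectives are minimized.
    return (compute_cost(schedule), staleness_penalty(schedule))


def dominates(a, b):
    # a dominates b if it is no worse on every objective and differs somewhere.
    return all(x <= y for x, y in zip(a, b)) and a != b


def pareto_front(population):
    best = {}
    for schedule in population:
        best.setdefault(evaluate(schedule), schedule)  # one schedule per score
    scores = list(best)
    return [best[f] for f in scores if not any(dominates(g, f) for g in scores)]


def crossover(rng, a, b):
    # Single cut along the step axis; row 0 always comes from parent a.
    cut = rng.randrange(1, STEPS)
    return a[:cut] + b[cut:]


def mutate(rng, schedule, rate=0.05):
    # Flip each cache decision with a small probability, leaving row 0 alone.
    rows = [schedule[0]]
    for row in schedule[1:]:
        rows.append(tuple((not c) if rng.random() < rate else c for c in row))
    return tuple(rows)


def evolve(generations=30, pop_size=40, seed=0):
    rng = random.Random(seed)
    population = [random_schedule(rng) for _ in range(pop_size)]
    for _ in range(generations):
        front = pareto_front(population)
        children = []
        while len(children) < pop_size:
            a, b = rng.choice(front), rng.choice(front)
            children.append(mutate(rng, crossover(rng, a, b)))
        population = front + children  # non-dominated parents survive
    return pareto_front(population)
```

Calling `sorted(evaluate(s) for s in evolve())` lists the trade-off points found, from the cheapest schedule to the one with the least reuse. Picking a point on that list corresponds to the kind of quality-latency control the abstract mentions, under the toy objectives assumed here.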

