Evolutionary Caching to Accelerate Your Off-the-Shelf Diffusion Model

Abstract

Diffusion-based image generation models excel at producing high-quality synthetic content, but their inference is slow and computationally expensive. Prior work has attempted to mitigate this by caching features within diffusion transformers and reusing them across inference steps. These methods, however, often rely on rigid heuristics that yield limited acceleration or generalize poorly across architectures. We propose Evolutionary Caching to Accelerate Diffusion models (ECAD), a genetic algorithm that learns efficient, per-model caching schedules forming a Pareto frontier, using only a small set of calibration prompts. ECAD requires no modifications to network parameters or reference images. It offers significant inference speedups, enables fine-grained control over the quality-latency trade-off, and adapts seamlessly to different diffusion models. Notably, ECAD's learned schedules can generalize effectively to resolutions and model variants not seen during calibration. We evaluate ECAD on PixArt-alpha, PixArt-Sigma, and FLUX-1.dev using multiple metrics (FID, CLIP, Image Reward) across diverse benchmarks (COCO, MJHQ-30k, PartiPrompts), demonstrating consistent improvements over previous approaches. On PixArt-alpha, ECAD identifies a schedule that outperforms the previous state-of-the-art method by 4.47 COCO FID while increasing inference speedup from 2.35x to 2.58x. Our results establish ECAD as a scalable and generalizable approach for accelerating diffusion inference. Our project website is available at https://aniaggarwal.github.io/ecad and our code is available at https://github.com/aniaggarwal/ecad.
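The idea of evolving caching schedules toward a Pareto frontier can be illustrated with a small toy sketch. Everything below is an assumption made for illustration and is not taken from the ECAD code: the schedule is a flat tuple of bits (1 = recompute a block at a step, 0 = reuse its cached feature), the sizes STEPS and BLOCKS are arbitrary, the "error" objective is a made-up staleness proxy standing in for a real image-quality metric measured on calibration prompts, and the selection scheme is a simple elitist one rather than whatever the authors use.

```python
import random

STEPS, BLOCKS = 8, 4  # toy sizes chosen for illustration, not from the paper


def evaluate(schedule):
    """Return (cost, error), both to be minimized.

    cost  = fraction of (step, block) evaluations that are recomputed
    error = hypothetical staleness proxy, NOT a real quality metric
    """
    recomputed, staleness = 0, 0
    for b in range(BLOCKS):
        age = 0  # number of steps since this block was last recomputed
        for s in range(STEPS):
            # Step 0 always recomputes because nothing is cached yet.
            if s == 0 or schedule[s * BLOCKS + b]:
                recomputed += 1
                age = 0
            else:
                age += 1
                staleness += age
    total = STEPS * BLOCKS
    return recomputed / total, staleness / total


def dominates(a, b):
    """True if a is no worse than b on every objective and better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(population):
    """Return (fitness, schedule) pairs that no other candidate dominates."""
    scored = [(evaluate(ind), ind) for ind in set(population)]
    return [(f, ind) for f, ind in scored
            if not any(dominates(g, f) for g, _ in scored)]


def evolve(generations=30, pop_size=24, mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    n = STEPS * BLOCKS
    population = [tuple(rng.randint(0, 1) for _ in range(n))
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Elitism: keep non-dominated schedules, capped to leave room for children.
        parents = [ind for _, ind in pareto_front(population)][:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.choice(parents), rng.choice(parents)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = tuple(bit ^ (rng.random() < mutation_rate) for bit in child)
            children.append(child)
        population = parents + children
    return sorted(pareto_front(population))


if __name__ == "__main__":
    for (cost, error), _ in evolve():
        print(f"compute fraction {cost:.3f}  staleness proxy {error:.3f}")
```

Each entry of the returned front is one point on the toy cost-versus-error trade-off, so picking a schedule amounts to picking how much computation to spend. In a real setting, the staleness proxy would be replaced by running the diffusion model with the candidate schedule and scoring its outputs.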
