SeaCache: Spectral-Evolution-Aware Cache for Accelerating Diffusion Models

Abstract

Diffusion models are a strong backbone for visual generation, but their inherently sequential denoising process leads to slow inference. Previous methods accelerate sampling by caching and reusing intermediate outputs based on feature distances between adjacent timesteps. However, existing caching strategies typically rely on raw feature differences that entangle content and noise. This design overlooks spectral evolution, in which low-frequency structure appears early and high-frequency detail is refined later. We introduce Spectral-Evolution-Aware Cache (SeaCache), a training-free cache schedule that bases reuse decisions on a spectrally aligned representation. Through theoretical and empirical analysis, we derive a Spectral-Evolution-Aware (SEA) filter that preserves content-relevant components while suppressing noise. Using SEA-filtered input features to estimate redundancy yields dynamic schedules that adapt to content while respecting the spectral priors underlying the diffusion model. Extensive experiments across diverse visual generative models and baselines show that SeaCache achieves state-of-the-art latency-quality trade-offs.
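
The general idea of basing reuse decisions on a filtered feature distance can be illustrated with a toy sketch. This is not the SEA filter derived in the paper: `lowpass` is a hypothetical Gaussian low-pass stand-in whose passband simply widens with the step index, and `run_with_cache`, `rel_l1`, `threshold`, and the accumulate-then-reset rule are all assumptions chosen for illustration. The model is any callable, and the cached output is reused as-is.

```python
import numpy as np

def lowpass(x, keep_frac):
    """Hypothetical stand-in filter (NOT the paper's SEA filter):
    Gaussian low-pass along the last axis, wider for larger keep_frac."""
    freqs = np.fft.rfftfreq(x.shape[-1])            # 0 .. 0.5 cycles/sample
    sigma = 0.5 * max(keep_frac, 1e-3)
    mask = np.exp(-0.5 * (freqs / sigma) ** 2)
    spectrum = np.fft.rfft(x, axis=-1) * mask
    return np.fft.irfft(spectrum, n=x.shape[-1], axis=-1)

def rel_l1(a, b):
    """Relative L1 distance of a with respect to b."""
    return float(np.abs(a - b).mean() / (np.abs(b).mean() + 1e-8))

def run_with_cache(inputs, model, threshold=0.1):
    """Toy cache schedule: recompute only when the accumulated filtered
    distance between adjacent inputs reaches `threshold` (assumed rule)."""
    outputs, recomputed = [], []
    cached_out, prev_x, acc = None, None, 0.0
    n = len(inputs)
    for t, x in enumerate(inputs):
        keep = (t + 1) / n                          # passband widens over steps
        if prev_x is not None:
            # Filter both inputs identically so the distance is comparable.
            acc += rel_l1(lowpass(x, keep), lowpass(prev_x, keep))
        if cached_out is None or acc >= threshold:
            cached_out = model(x)                   # full forward pass
            acc = 0.0
            recomputed.append(True)
        else:
            recomputed.append(False)                # reuse cached output
        outputs.append(cached_out)
        prev_x = x
    return outputs, recomputed
```

In this sketch, a sequence of nearly identical inputs triggers a single forward pass, while a large change in the filtered input forces a recompute. A lower `threshold` trades speed for fidelity.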
