DreamCache: Finetuning-Free Lightweight Personalized Image Generation via Feature Caching

Abstract

Personalized image generation requires text-to-image generative models that capture the core features of a reference subject to allow for controlled generation across different contexts. Existing methods face challenges due to complex training requirements, high inference costs, limited flexibility, or a combination of these issues. In this paper, we introduce DreamCache, a scalable approach for efficient and high-quality personalized image generation. By caching a small number of reference image features from a subset of layers and a single timestep of the pretrained diffusion denoiser, DreamCache enables dynamic modulation of the generated image features through lightweight, trained conditioning adapters. DreamCache achieves state-of-the-art image and text alignment, utilizing an order of magnitude fewer extra parameters, and is both more computationally effective and versatile than existing models.
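
The caching idea described above can be illustrated with a toy sketch. This is not the authors' implementation: the stand-in denoiser `toy_denoiser_layers`, the choice of cached layers, the cached timestep `t_cache`, and the use of a cross-attention style adapter with random (untrained) weights are all assumptions made purely for illustration. It only shows the general shape of the approach: reference features are computed once at a single timestep for a subset of layers, stored, and later used to modulate the features of the image being generated.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8  # hypothetical feature width


def toy_denoiser_layers(tokens, t, num_layers=4):
    """Hypothetical stand-in for a pretrained denoiser: returns one feature map per layer."""
    feats = []
    h = tokens
    for i in range(num_layers):
        h = np.tanh(h + 0.1 * (i + 1) * t)
        feats.append(h)
    return feats


def build_cache(ref_tokens, cache_layers, t_cache):
    """Run the denoiser once on the reference and keep only the selected layers."""
    feats = toy_denoiser_layers(ref_tokens, t_cache)
    return {i: feats[i] for i in cache_layers}


def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def adapter(gen_feats, cached_feats, w_q, w_k, w_v):
    """Assumed adapter form: generated features attend to cached reference features."""
    q = gen_feats @ w_q
    k = cached_feats @ w_k
    v = cached_feats @ w_v
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return gen_feats + attn @ v  # residual modulation of the generated features


# One-time caching step: 6 reference tokens, layers 1 and 3, a single timestep.
ref_tokens = rng.normal(size=(6, DIM))
cache = build_cache(ref_tokens, cache_layers=[1, 3], t_cache=0.5)

# At generation time, the cached features are reused without re-encoding the reference.
# The weights would be trained in practice; they are random here.
w_q, w_k, w_v = (rng.normal(scale=0.1, size=(DIM, DIM)) for _ in range(3))
gen = rng.normal(size=(10, DIM))
out = adapter(gen, cache[1], w_q, w_k, w_v)
```

Because the cache is built once per reference subject, the per-step cost during generation in this sketch is limited to the adapter computation over a small set of stored features.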
