Understanding and Mitigating Copying in Diffusion Models

Abstract

Images generated by diffusion models such as Stable Diffusion are increasingly widespread. Recent work, and even lawsuits, have shown that these models are prone to replicating their training data, unbeknownst to the user. In this paper, we first analyze this memorization problem in text-to-image diffusion models. While it is widely believed that duplicated images in the training set are responsible for content replication at inference time, we observe that the text conditioning of the model plays a similarly important role. In fact, our experiments show that data replication often does not occur in unconditional models, while it is common in the text-conditional case. Motivated by these findings, we propose several techniques for reducing data replication at both training and inference time by randomizing and augmenting image captions in the training set.
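
As a rough illustration of what randomizing image captions could look like, the sketch below replaces each caption token with a random word at a fixed probability. This is one assumed form of randomization, not necessarily the procedure used in the paper; the function name `randomize_caption`, the `vocab` list, and the `replace_prob` parameter are all hypothetical.

```python
import random


def randomize_caption(caption, vocab, replace_prob=0.1, rng=None):
    """Return a copy of `caption` with some tokens swapped for random words.

    Hypothetical helper for illustration only: each whitespace-separated
    token is independently replaced by a word drawn uniformly from `vocab`
    with probability `replace_prob`.
    """
    rng = rng or random.Random()
    tokens = caption.split()
    return " ".join(
        rng.choice(vocab) if rng.random() < replace_prob else tok
        for tok in tokens
    )


if __name__ == "__main__":
    # Assumed toy vocabulary; a real setup would more likely draw from the
    # text encoder's tokenizer vocabulary.
    vocab = ["red", "blue", "green", "large", "small"]
    print(randomize_caption("a photo of a cat on a sofa", vocab, replace_prob=0.3))
```

Applied at training time, a perturbation like this means a given image is no longer always paired with exactly the same caption, which is the link between text conditioning and replication that the abstract points to.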

