Understanding Hallucinations in Diffusion Models through Mode Interpolation

Abstract

Colloquially speaking, image generation models based on diffusion processes are frequently said to exhibit "hallucinations": samples that could never occur in the training data. But where do such hallucinations come from? In this paper, we study a particular failure mode in diffusion models, which we term mode interpolation. Specifically, we find that diffusion models smoothly "interpolate" between nearby data modes in the training set, generating samples that lie completely outside the support of the original training distribution. This phenomenon leads diffusion models to generate artifacts that never existed in real data (i.e., hallucinations). We systematically study the reasons for, and the manifestations of, this phenomenon. Through experiments on 1D and 2D Gaussians, we show how a discontinuous loss landscape in the diffusion model's decoder leads to a region where any smooth approximation will cause such hallucinations. Through experiments on artificial datasets with various shapes, we show how hallucination leads to the generation of combinations of shapes that never existed. Finally, we show that diffusion models in fact know when they go out of support and hallucinate. This is captured by the high variance in the trajectory of the generated sample over the final few steps of the backward sampling process. Using a simple metric to capture this variance, we can remove over 95% of hallucinations at generation time while retaining 96% of in-support samples. We conclude our exploration by showing the implications of such hallucination (and its removal) on the collapse (and stabilization) of recursive training on synthetic data, with experiments on MNIST and 2D Gaussian datasets. We release our code at https://github.com/locuslab/diffusion-model-hallucination.
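To make the filtering idea concrete, the following is a minimal sketch of a trajectory-variance filter. It is not the authors' implementation: the abstract does not specify the exact metric, so the function names (`trajectory_variance`, `filter_by_trajectory_variance`), the array layout, the choice of averaging the variance over dimensions, and the threshold value are all assumptions made for illustration. The toy trajectories are hand-built rather than produced by a real diffusion model.

```python
import numpy as np


def trajectory_variance(x0_preds: np.ndarray) -> np.ndarray:
    """Score each sample by how much its trajectory moves over the last steps.

    x0_preds has shape (T, N, D): T late reverse-sampling steps, N samples,
    D dimensions (hypothetical layout). Returns an (N,) array holding the
    variance over the T steps, averaged over the D dimensions.
    """
    return x0_preds.var(axis=0).mean(axis=-1)


def filter_by_trajectory_variance(samples, x0_preds, threshold):
    """Keep samples whose trajectory variance is at or below the threshold."""
    scores = trajectory_variance(x0_preds)
    keep = scores <= threshold
    return samples[keep], scores


def demo():
    """Toy example with two data modes at 1.0 and 2.0 (assumed values)."""
    T = 5
    steps = np.arange(T).reshape(T, 1, 1)
    # Three "in-support" trajectories that drift only slightly near mode 1.0.
    stable = np.full((T, 3, 1), 1.0) + 0.01 * steps
    # Two "hallucinating" trajectories that flip between the two modes.
    unstable = np.where(steps % 2 == 0, 1.0, 2.0) * np.ones((T, 2, 1))
    x0_preds = np.concatenate([stable, unstable], axis=1)  # shape (5, 5, 1)
    samples = x0_preds[-1]  # final-step values stand in for the samples
    # The threshold is an arbitrary illustrative value, not one from the paper.
    return filter_by_trajectory_variance(samples, x0_preds, threshold=0.05)


if __name__ == "__main__":
    kept, scores = demo()
    print("kept samples:", kept.shape[0])
    print("scores:", np.round(scores, 4))
```

In practice the threshold would presumably be tuned on held-out data where in-support and out-of-support samples can be told apart; that calibration step is outside the scope of this sketch.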
