Understanding and Mitigating Copying in Diffusion Models
May 31, 2023
Authors: Gowthami Somepalli, Vasu Singla, Micah Goldblum, Jonas Geiping, Tom Goldstein
cs.AI
Abstract
Images generated by diffusion models like Stable Diffusion are increasingly
widespread. Recent works and even lawsuits have shown that these models are
prone to replicating their training data, unbeknownst to the user. In this
paper, we first analyze this memorization problem in text-to-image diffusion
models. While it is widely believed that duplicated images in the training set
are responsible for content replication at inference time, we observe that the
text conditioning of the model plays a similarly important role. In fact, we
see in our experiments that data replication often does not happen for
unconditional models, while it is common in the text-conditional case.
Motivated by our findings, we then propose several techniques for reducing data
replication at both training and inference time by randomizing and augmenting
image captions in the training set.
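The abstract's core mitigation idea is to randomize and augment image captions during training so the model cannot memorize a tight text-to-image mapping. The paper proposes several variants; the sketch below is a hypothetical illustration of two such perturbations (not the authors' exact recipe): dropping the caption entirely with some probability, so the sample is trained unconditionally, and otherwise replacing individual words with random vocabulary tokens. The function name, parameters, and vocabulary are all assumptions for illustration.

```python
import random


def randomize_caption(caption, vocab, p_replace=0.1, p_drop=0.4, rng=None):
    """Perturb a training caption to weaken text-conditioned memorization.

    Hypothetical sketch: with probability p_drop the caption is dropped
    (the sample trains unconditionally); otherwise each word is replaced
    by a random vocabulary token with probability p_replace.
    """
    rng = rng or random.Random()
    if rng.random() < p_drop:
        return ""  # empty caption -> unconditional training signal
    words = caption.split()
    # Independently replace each word with a random token from vocab.
    words = [rng.choice(vocab) if rng.random() < p_replace else w
             for w in words]
    return " ".join(words)
```

In a training loop, this would be applied to each caption when a batch is sampled, so the same image sees different (or no) captions across epochs, breaking the deterministic caption-image pairing that the paper identifies as a driver of replication.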