Understanding and Mitigating Copying in Diffusion Models

Abstract

Images generated by diffusion models like Stable Diffusion are increasingly widespread. Recent works and even lawsuits have shown that these models are prone to replicating their training data, unbeknownst to the user. In this paper, we first analyze this memorization problem in text-to-image diffusion models. While it is widely believed that duplicated images in the training set are responsible for content replication at inference time, we observe that the text conditioning of the model plays a similarly important role. In fact, we see in our experiments that data replication often does not happen for unconditional models, while it is common in the text-conditional case. Motivated by our findings, we then propose several techniques for reducing data replication at both training and inference time by randomizing and augmenting image captions in the training set.
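As a rough illustration of what randomizing image captions could look like in a data pipeline, the sketch below perturbs a caption before it is passed to the text encoder. It is a minimal sketch under stated assumptions, not the method of the paper: the helper `randomize_caption`, the `vocab` list, and the probabilities `p_replace` and `p_token` are all hypothetical names and values chosen for illustration.

```python
import random


def randomize_caption(caption, vocab, rng, p_replace=0.1, p_token=0.1):
    """Return a perturbed copy of `caption` (hypothetical helper).

    Assumptions for illustration only: `vocab` is a list of replacement
    words, and both probabilities are arbitrary example values.
    """
    words = caption.split()
    # With probability p_replace, discard the caption entirely and
    # substitute a random word sequence of the same length.
    if rng.random() < p_replace:
        return " ".join(rng.choice(vocab) for _ in range(len(words)))
    # Otherwise, swap each word for a random vocabulary word
    # independently with probability p_token.
    out = [rng.choice(vocab) if rng.random() < p_token else w for w in words]
    return " ".join(out)


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so repeated runs give the same result
    vocab = ["red", "blue", "cat", "tree", "seven"]
    print(randomize_caption("a photo of a dog on a beach", vocab, rng))
```

The intuition, following the observation above about text conditioning, is that a caption paired too consistently with one image can act as a key for retrieving it; perturbing the caption weakens that pairing. Applying such a function each time a training sample is loaded would give the model a slightly different caption per epoch.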
