Holistic Unlearning Benchmark: A Multi-Faceted Evaluation for Text-to-Image Diffusion Model Unlearning
October 8, 2024
Authors: Saemi Moon, Minjong Lee, Sangdon Park, Dongwoo Kim
cs.AI
Abstract
As text-to-image diffusion models become advanced enough for commercial applications, there is also increasing concern about their potential for malicious and harmful use. Model unlearning has been proposed to mitigate these concerns by removing undesired and potentially harmful information from the pre-trained model. So far, the success of unlearning is mainly measured by whether the unlearned model can generate a target concept while maintaining image quality. However, unlearning is typically tested under limited scenarios, and the side effects of unlearning have barely been studied in the current literature. In this work, we thoroughly analyze unlearning under various scenarios with five key aspects. Our investigation reveals that every method has side effects or limitations, especially in more complex and realistic situations. By releasing our comprehensive evaluation framework with the source codes and artifacts, we hope to inspire further research in this area, leading to more reliable and effective unlearning methods.
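The evaluation criterion mentioned in the abstract (checking whether an unlearned model still generates the erased concept while preserving image quality) can be probed with a short script. Below is a minimal sketch of that idea, not the paper's benchmark: the unlearned checkpoint path, the probe concept, and the use of CLIP image-text similarity as a concept detector are all illustrative assumptions. Image quality would additionally be measured against a reference set (e.g., FID on unrelated prompts), which is omitted here for brevity.

```python
# Minimal sketch of a concept-erasure probe (illustrative, not the paper's benchmark).
# Generates images from the erased-concept prompt with the unlearned model and
# scores them against the concept text with CLIP: high similarity suggests the
# concept was not actually removed.
import torch
from diffusers import StableDiffusionPipeline
from transformers import CLIPModel, CLIPProcessor

device = "cuda" if torch.cuda.is_available() else "cpu"

# Hypothetical checkpoint of a model unlearned for the concept "Van Gogh style".
pipe = StableDiffusionPipeline.from_pretrained("path/to/unlearned-model").to(device)

clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").to(device)
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

target_concept = "a painting in the style of Van Gogh"
prompts = [target_concept] * 8  # small probe set for illustration

images = pipe(prompts, num_inference_steps=30).images

# CLIP similarity between the generated images and the erased-concept text.
inputs = processor(text=[target_concept], images=images,
                   return_tensors="pt", padding=True).to(device)
with torch.no_grad():
    out = clip(**inputs)
    image_emb = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
    text_emb = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
    scores = (image_emb @ text_emb.T).squeeze(-1)

print(f"mean CLIP similarity to erased concept: {scores.mean().item():.3f}")
```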