Machine Unlearning for Image-to-Image Generative Models
February 1, 2024
Authors: Guihong Li, Hsiang Hsu, Chun-Fu Chen, Radu Marculescu
cs.AI
Abstract
Machine unlearning has emerged as a new paradigm to deliberately forget data
samples from a given model in order to adhere to stringent regulations.
However, existing machine unlearning methods have been primarily focused on
classification models, leaving the landscape of unlearning for generative
models relatively unexplored. This paper serves as a bridge, addressing the gap
by providing a unifying framework of machine unlearning for image-to-image
generative models. Within this framework, we propose a
computationally efficient algorithm, underpinned by rigorous theoretical
analysis, that demonstrates negligible performance degradation on the retain
samples, while effectively removing the information from the forget samples.
Empirical studies on two large-scale datasets, ImageNet-1K and Places-365,
further show that our algorithm does not rely on the availability of the retain
samples, which helps it comply with data retention policies. To the best of our
knowledge, this work is the first systematic, theoretical, and empirical
exploration of machine unlearning specifically tailored for
image-to-image generative models. Our code is available at
https://github.com/jpmorganchase/l2l-generator-unlearning.