ChatPaper.ai

Training Data Protection with Compositional Diffusion Models

August 2, 2023
Authors: Aditya Golatkar, Alessandro Achille, Ashwin Swaminathan, Stefano Soatto
cs.AI

Abstract

We introduce Compartmentalized Diffusion Models (CDM), a method to train different diffusion models (or prompts) on distinct data sources and arbitrarily compose them at inference time. The individual models can be trained in isolation, at different times, and on different distributions and domains and can be later composed to achieve performance comparable to a paragon model trained on all data simultaneously. Furthermore, each model only contains information about the subset of the data it was exposed to during training, enabling several forms of training data protection. In particular, CDMs are the first method to enable both selective forgetting and continual learning for large-scale diffusion models, as well as allowing serving customized models based on the user's access rights. CDMs also allow determining the importance of a subset of the data in generating particular samples.
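The core idea above — training separate diffusion models on disjoint data sources and mixing their predictions at inference — can be illustrated with a minimal sketch. The denoiser functions, weights, and composition rule below are hypothetical stand-ins (the paper's actual mixture weights are derived from each source's posterior probability, not fixed constants); real CDMs would use trained networks in place of the toy closed-form models here.

```python
import numpy as np

def make_toy_denoiser(mean):
    """A stand-in for a diffusion model trained on one data source.

    A real model would be a neural network predicting the noise
    epsilon(x, t); here we use a closed-form toy that pushes x
    toward the source's data mode at `mean`.
    """
    def denoiser(x, t):
        return x - mean
    return denoiser

def compose(denoisers, weights, x, t):
    """Inference-time composition: a weighted mixture of the
    individual models' outputs (weights are normalized to sum to 1).

    Selective forgetting of a source then amounts to dropping its
    (denoiser, weight) pair from the mixture; continual learning
    amounts to appending a newly trained pair.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    outs = np.stack([d(x, t) for d in denoisers])
    return np.tensordot(w, outs, axes=1)

# Two "sources" with modes at +1 and -1, composed with equal weight.
d_a = make_toy_denoiser(np.array([1.0]))
d_b = make_toy_denoiser(np.array([-1.0]))
combined = compose([d_a, d_b], [0.5, 0.5], np.array([0.0]), t=0)
```

Because each component only ever saw its own data subset, removing a component provably removes that subset's influence on generation — the property the abstract calls training data protection.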