Dataset Size Recovery from LoRA Weights
June 27, 2024
Authors: Mohammad Salama, Jonathan Kahana, Eliahu Horwitz, Yedid Hoshen
cs.AI
Abstract
Model inversion and membership inference attacks aim to reconstruct and
verify the data which a model was trained on. However, they are not guaranteed
to find all training samples as they do not know the size of the training set.
In this paper, we introduce a new task: dataset size recovery, that aims to
determine the number of samples used to train a model, directly from its
weights. We then propose DSiRe, a method for recovering the number of images
used to fine-tune a model, in the common case where fine-tuning uses LoRA. We
discover that both the norm and the spectrum of the LoRA matrices are closely
linked to the fine-tuning dataset size; we leverage this finding to propose a
simple yet effective prediction algorithm. To evaluate dataset size recovery of
LoRA weights, we develop and release a new benchmark, LoRA-WiSE, consisting of
over 25000 weight snapshots from more than 2000 diverse LoRA fine-tuned models.
Our best classifier can predict the number of fine-tuning images with a mean
absolute error of 0.36 images, establishing the feasibility of this attack.
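The abstract's core idea, that the norm and singular-value spectrum of LoRA update matrices carry a signal about the fine-tuning set size, can be sketched as follows. This is a minimal illustration, not the paper's DSiRe implementation: the feature choice (Frobenius norm plus full spectrum of ΔW = BA) and the 1-nearest-neighbor predictor are assumptions consistent with the description of a "simple yet effective prediction algorithm", and all function names are hypothetical.

```python
import numpy as np

def lora_features(B, A):
    """Feature vector for one LoRA layer, whose update is delta_W = B @ A.

    Returns the Frobenius norm followed by the singular-value spectrum,
    the two quantities the abstract links to fine-tuning dataset size.
    """
    delta_w = B @ A
    norm = np.linalg.norm(delta_w)  # Frobenius norm
    spectrum = np.linalg.svd(delta_w, compute_uv=False)
    return np.concatenate([[norm], spectrum])

def predict_dataset_size(query_feats, train_feats, train_sizes):
    """1-nearest-neighbor prediction of the number of fine-tuning images.

    train_feats: (n, d) features from LoRA models with known dataset sizes;
    train_sizes: (n,) the corresponding sizes. A labeled benchmark such as
    LoRA-WiSE would supply this reference set.
    """
    dists = np.linalg.norm(train_feats - query_feats, axis=1)
    return train_sizes[int(np.argmin(dists))]

# Toy usage with synthetic LoRA factors (rank r = 2).
B = np.ones((4, 2))
A = np.ones((2, 3))
feats = lora_features(B, A)
```

In practice one would extract such features from every LoRA layer of a fine-tuned model and aggregate the per-layer predictions (e.g. by majority vote), but the per-layer feature-and-classify step above captures the attack's basic mechanics.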