SEAL: Entangled White-box Watermarks on Low-Rank Adaptation
January 16, 2025
Authors: Giyeong Oh, Saejin Kim, Woohyun Cho, Sangkyu Lee, Jiwan Chung, Dokyung Song, Youngjae Yu
cs.AI
Abstract
Recently, LoRA and its variants have become the de facto strategy for
training and sharing task-specific versions of large pretrained models, thanks
to their efficiency and simplicity. However, the issue of copyright protection
for LoRA weights, especially through watermark-based techniques, remains
underexplored. To address this gap, we propose SEAL (SEcure wAtermarking on
LoRA weights), a universal white-box watermarking technique for LoRA. SEAL embeds a
secret, non-trainable matrix between trainable LoRA weights, serving as a
passport to claim ownership. SEAL then entangles the passport with the LoRA
weights through training, without an additional loss term for the entanglement, and distributes
the finetuned weights after hiding the passport. When applying SEAL, we
observed no performance degradation across commonsense reasoning,
textual/visual instruction tuning, and text-to-image synthesis tasks. We
demonstrate that SEAL is robust against a variety of known attacks: removal,
obfuscation, and ambiguity attacks.
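To make the mechanism concrete, here is a minimal PyTorch sketch (not the authors' code; the class name SEALLoRALinear and the parameter passport_seed are hypothetical) of a LoRA layer whose low-rank update is routed through a fixed, non-trainable passport matrix C, so the layer computes W x + B C A x and the gradients of the trainable factors B and A are necessarily entangled with C:

```python
import torch
import torch.nn as nn

class SEALLoRALinear(nn.Module):
    """LoRA linear layer with a fixed 'passport' matrix entangled
    between the trainable low-rank factors (illustrative sketch)."""

    def __init__(self, in_features, out_features, rank=8, passport_seed=0):
        super().__init__()
        # Frozen pretrained weight W (random stand-in for illustration).
        self.weight = nn.Parameter(
            torch.randn(out_features, in_features), requires_grad=False
        )
        # Trainable low-rank factors: A projects down, B projects up
        # (B starts at zero, so the update is zero at initialization,
        # as in standard LoRA).
        self.lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))
        # Secret, non-trainable passport C derived from a private seed;
        # registered as a buffer so it never receives gradients.
        gen = torch.Generator().manual_seed(passport_seed)
        self.register_buffer(
            "passport", torch.randn(rank, rank, generator=gen)
        )

    def forward(self, x):
        # Base path W x plus the passport-entangled update B C A x.
        base = x @ self.weight.T
        update = x @ self.lora_A.T @ self.passport.T @ self.lora_B.T
        return base + update

# Usage: only A and B receive gradients; C stays fixed throughout training.
layer = SEALLoRALinear(64, 64, rank=8, passport_seed=42)
out = layer(torch.randn(2, 64))
out.sum().backward()
assert layer.passport.grad is None  # buffers carry no gradients
```

Consistent with the abstract's description of "hiding the passport," one plausible way to distribute such weights is to fold C into the released factors (e.g., shipping B' = BC alongside A), so the published adapter looks like an ordinary LoRA while the owner retains the secret C to later demonstrate the entanglement and claim ownership.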