Hashed Watermark as a Filter: Defeating Forging and Overwriting Attacks in Weight-based Neural Network Watermarking
July 15, 2025
Authors: Yuan Yao, Jin Song, Jian Jin
cs.AI
Abstract
As valuable digital assets, deep neural networks necessitate robust ownership
protection, positioning neural network watermarking (NNW) as a promising
solution. Among various NNW approaches, weight-based methods are favored for
their simplicity and practicality; however, they remain vulnerable to forging
and overwriting attacks. To address these challenges, we propose NeuralMark, a
robust method built around a hashed watermark filter. Specifically, we utilize
a hash function to generate an irreversible binary watermark from a secret key,
which is then used as a filter to select the model parameters for embedding.
This design cleverly intertwines the embedding parameters with the hashed
watermark, providing a robust defense against both forging and overwriting
attacks. Average pooling is also incorporated to resist fine-tuning and
pruning attacks. Furthermore, NeuralMark can be seamlessly integrated into various
neural network architectures, ensuring broad applicability. Theoretically, we
analyze its security boundary. Empirically, we verify its effectiveness and
robustness across 13 distinct Convolutional and Transformer architectures,
covering five image classification tasks and one text generation task. The
source code is available at https://github.com/AIResearch-Group/NeuralMark.
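
The three steps named in the abstract (hashing a secret key into a binary watermark, using that watermark as a filter over model parameters, and average pooling the selected parameters) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the choice of SHA-256, the tiling of the watermark over a flattened parameter list, and the chunked pooling are all assumptions made for illustration; the linked repository contains the actual method.

```python
import hashlib


def hashed_watermark(secret_key: bytes, n_bits: int = 256) -> list[int]:
    """Derive a binary watermark from a secret key.

    Assumption: SHA-256 stands in for the hash function; the abstract
    does not say which one is used.
    """
    if not 0 < n_bits <= 256:
        raise ValueError("this sketch supports between 1 and 256 bits")
    digest = hashlib.sha256(secret_key).digest()
    # Unpack each byte into 8 bits, most significant bit first.
    bits = [(byte >> (7 - i)) & 1 for byte in digest for i in range(8)]
    return bits[:n_bits]


def filter_parameters(params: list[float], watermark: list[int]) -> list[float]:
    """Keep the parameters that line up with a 1-bit of the watermark.

    Assumption: the watermark is tiled over the flattened parameters.
    """
    n = len(watermark)
    return [p for i, p in enumerate(params) if watermark[i % n] == 1]


def average_pool(values: list[float], out_len: int) -> list[float]:
    """Reduce values to out_len entries by averaging contiguous chunks."""
    if out_len <= 0 or len(values) < out_len:
        raise ValueError("need at least one value per output entry")
    bounds = [i * len(values) // out_len for i in range(out_len + 1)]
    return [sum(values[a:b]) / (b - a) for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    watermark = hashed_watermark(b"owner-secret", n_bits=64)  # hypothetical key
    weights = [0.01 * (i % 17 - 8) for i in range(6400)]      # toy "parameters"
    selected = filter_parameters(weights, watermark)
    pooled = average_pool(selected, out_len=len(watermark))
    print(len(selected), len(pooled))
```

Because the selection depends on the hash output, a different key selects a different subset of parameters, which gives a rough intuition for why the abstract describes the embedding parameters and the watermark as intertwined.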