Hashed Watermark as a Filter: Defeating Forging and Overwriting Attacks in Weight-based Neural Network Watermarking

Abstract

As valuable digital assets, deep neural networks require robust ownership protection, and neural network watermarking (NNW) has emerged as a promising solution. Among NNW approaches, weight-based methods are favored for their simplicity and practicality; however, they remain vulnerable to forging and overwriting attacks. To address these challenges, we propose NeuralMark, a robust method built around a hashed watermark filter. Specifically, we use a hash function to generate an irreversible binary watermark from a secret key, and this watermark then serves as a filter that selects the model parameters used for embedding. This design ties the embedding parameters to the hashed watermark, providing a robust defense against both forging and overwriting attacks. Average pooling is also incorporated to resist fine-tuning and pruning attacks. Furthermore, NeuralMark can be seamlessly integrated into various neural network architectures, ensuring broad applicability. Theoretically, we analyze its security boundary. Empirically, we verify its effectiveness and robustness across 13 distinct convolutional and Transformer architectures, covering five image classification tasks and one text generation task. The source code is available at https://github.com/AIResearch-Group/NeuralMark.
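
The pipeline described in the abstract (hash a secret key into a binary watermark, use the watermark as a filter over model parameters, then average-pool the selected parameters) can be illustrated with a minimal sketch. This is not the NeuralMark implementation; see the linked repository for that. The choice of SHA-256, the tiling rule that turns the watermark into a selection mask, the pooling window, and the function names `hashed_watermark`, `filter_parameters`, and `average_pool` are all assumptions made for illustration.

```python
import hashlib

import numpy as np


def hashed_watermark(secret_key: bytes, n_bits: int = 256) -> np.ndarray:
    """Derive a binary watermark from a secret key (SHA-256 is an assumed choice)."""
    if not 0 < n_bits <= 256:
        raise ValueError("SHA-256 yields at most 256 bits")
    digest = hashlib.sha256(secret_key).digest()
    # Expand the 32 digest bytes into 256 individual 0/1 values.
    return np.unpackbits(np.frombuffer(digest, dtype=np.uint8))[:n_bits]


def filter_parameters(weights: np.ndarray, watermark: np.ndarray) -> np.ndarray:
    """Keep the parameters whose position lines up with a 1 bit.

    Assumed selection rule: the watermark is tiled across the flattened
    weights and used as a boolean mask.
    """
    flat = weights.ravel()
    mask = np.resize(watermark, flat.size).astype(bool)
    return flat[mask]


def average_pool(values: np.ndarray, window: int) -> np.ndarray:
    """Non-overlapping 1-D average pooling; a trailing partial window is dropped."""
    usable = (values.size // window) * window
    return values[:usable].reshape(-1, window).mean(axis=1)


rng = np.random.default_rng(0)
weights = rng.standard_normal((64, 128))  # stand-in for one layer's weights
watermark = hashed_watermark(b"owner-secret-key")
selected = filter_parameters(weights, watermark)
pooled = average_pool(selected, window=4)
```

Because the mask is derived from the hash output, the set of selected parameters depends on the watermark itself, which is the coupling the abstract credits for resistance to forging and overwriting. The pooling step averages neighboring selected parameters, so small per-parameter changes from fine-tuning or pruning are smoothed out in the pooled values.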
