PECCAVI: Visual Paraphrase Attack Safe and Distortion Free Image Watermarking Technique for AI-Generated Images

Abstract

A report by the European Union Agency for Law Enforcement Cooperation (Europol) predicts that by 2026 up to 90 percent of online content could be synthetically generated. The forecast has raised concerns among policymakers, who cautioned that "Generative AI could act as a force multiplier for political disinformation. The combined effect of generative text, images, videos, and audio may surpass the influence of any single modality." In response, California's Bill AB 3211 mandates the watermarking of AI-generated images, videos, and audio. However, invisible watermarking techniques remain vulnerable to tampering, and malicious actors may be able to bypass them entirely. Generative AI-powered de-watermarking attacks, especially the newly introduced visual paraphrase attack, have been shown to remove watermarks completely while producing a paraphrase of the original image. This paper introduces PECCAVI, the first image watermarking technique that is both safe against visual paraphrase attacks and distortion-free. A visual paraphrase attack alters an image while preserving its core semantic regions, termed Non-Melting Points (NMPs). PECCAVI strategically embeds watermarks within these NMPs using multi-channel frequency-domain watermarking. It also incorporates noisy burnishing to counter reverse-engineering efforts that try to locate NMPs in order to disrupt the embedded watermark, thereby enhancing durability. PECCAVI is model-agnostic. All relevant resources and code will be open-sourced.
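To make the phrase "multi-channel frequency-domain watermarking" concrete, the following is a minimal sketch of the general idea, not the PECCAVI implementation. It assumes the region to protect is already known and is passed in as a rectangular `box` (standing in for an NMP; NMP detection and noisy burnishing are not shown), and it swaps in a simple keyed additive pattern on a mid-frequency FFT band in each colour channel. The function names `band_mask`, `key_spectrum`, `embed`, and `detect`, the band limits, and the `strength` value are all hypothetical choices made for illustration. The default strength is deliberately large so that detection is unambiguous on a noise test image; it does not reproduce the distortion-free property claimed in the abstract.

```python
import numpy as np


def band_mask(h, w, lo=0.1, hi=0.3):
    """Boolean mask selecting a mid-frequency ring of an h x w spectrum."""
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    r = np.sqrt(fy ** 2 + fx ** 2)
    return (r > lo) & (r < hi)


def key_spectrum(key, h, w, channels):
    """Spectrum of key-seeded real noise, one independent pattern per channel."""
    noise = np.random.default_rng(key).standard_normal((h, w, channels))
    return np.fft.fft2(noise, axes=(0, 1))


def embed(image, box, key, strength=0.3):
    """Add the keyed pattern to the mid-band spectrum of every channel in box.

    image: float array of shape (H, W, C); box: (top, left, height, width).
    """
    top, left, h, w = box
    out = image.astype(np.float64)  # astype returns a copy by default
    patch = out[top:top + h, left:left + w, :]
    mask = band_mask(h, w)[:, :, None]
    spec = np.fft.fft2(patch, axes=(0, 1))
    # The mask is symmetric in frequency and the key noise is real, so the
    # added term keeps Hermitian symmetry and the inverse FFT stays real.
    spec = spec + strength * mask * key_spectrum(key, h, w, patch.shape[2])
    out[top:top + h, left:left + w, :] = np.fft.ifft2(spec, axes=(0, 1)).real
    return out


def detect(image, box, key):
    """Mean normalised correlation between the mid-band spectrum and the key."""
    top, left, h, w = box
    patch = image[top:top + h, left:left + w, :].astype(np.float64)
    mask = band_mask(h, w)
    spec = np.fft.fft2(patch, axes=(0, 1))
    ref = key_spectrum(key, h, w, patch.shape[2])
    scores = []
    for c in range(patch.shape[2]):
        a, b = spec[:, :, c][mask], ref[:, :, c][mask]
        scores.append(np.vdot(b, a).real / (np.linalg.norm(a) * np.linalg.norm(b)))
    return float(np.mean(scores))
```

A caller would run `marked = embed(img, box, key)` and later compare `detect(candidate, box, key)` against a threshold chosen on unmarked images. Pixels outside `box` are left untouched, and a different key should yield a score near zero.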