Watermarking Autoregressive Image Generation

Abstract

Watermarking the outputs of generative models has emerged as a promising approach for tracking their provenance. Despite significant interest in autoregressive image generation models and their potential for misuse, no prior work has attempted to watermark their outputs at the token level. In this work, we present the first such approach by adapting language model watermarking techniques to this setting. We identify a key challenge: the lack of reverse cycle-consistency (RCC), wherein re-tokenizing generated image tokens significantly alters the token sequence, effectively erasing the watermark. To address this and to make our method robust to common image transformations, neural compression, and removal attacks, we introduce (i) a custom tokenizer-detokenizer finetuning procedure that improves RCC, and (ii) a complementary watermark synchronization layer. As our experiments demonstrate, our approach enables reliable and robust watermark detection with theoretically grounded p-values.
