Autoregressive Images Watermarking through Lexical Biasing: An Approach Resistant to Regeneration Attack
June 1, 2025
Authors: Siqi Hui, Yiren Song, Sanping Zhou, Ye Deng, Wenli Huang, Jinjun Wang
cs.AI
Abstract
Autoregressive (AR) image generation models have gained increasing attention
for their breakthroughs in synthesis quality, highlighting the need for robust
watermarking to prevent misuse. However, existing in-generation watermarking
techniques are primarily designed for diffusion models, where watermarks are
embedded within diffusion latent states. This design poses significant
challenges for direct adaptation to AR models, which generate images
sequentially through token prediction. Moreover, diffusion-based regeneration
attacks can effectively erase such watermarks by perturbing diffusion latent
states. To address these challenges, we propose Lexical Bias Watermarking
(LBW), a novel framework designed for AR models that resists regeneration
attacks. LBW embeds watermarks directly into token maps by biasing token
selection toward a predefined green list during generation. This approach
ensures seamless integration with existing AR models and extends naturally to
post-hoc watermarking. To strengthen security against white-box attacks, the
green list for each image is randomly sampled from a pool of green lists
rather than being fixed. Watermark detection is performed via
quantization and statistical analysis of the token distribution. Extensive
experiments demonstrate that LBW achieves superior watermark robustness,
particularly in resisting regeneration attacks.
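The core mechanism, biasing token selection toward a green list and detecting the watermark with a statistical test on the token distribution, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the codebook size, green-list fraction, logit bias, and the one-proportion z-test used for detection are all assumptions introduced here for clarity.

```python
import numpy as np

# Illustrative constants (assumed, not from the paper).
VOCAB_SIZE = 8192      # codebook size of the image tokenizer
GREEN_FRACTION = 0.5   # fraction of the vocabulary placed on the green list
BIAS = 2.0             # logit bias added to green-list tokens

rng = np.random.default_rng(0)

def sample_green_list(pool_seed):
    """Draw one green list from a pool, keyed by a per-image seed."""
    g = np.random.default_rng(pool_seed)
    perm = g.permutation(VOCAB_SIZE)
    return set(perm[: int(GREEN_FRACTION * VOCAB_SIZE)].tolist())

def biased_sample(logits, green):
    """Bias next-token selection toward the green list, then sample."""
    logits = logits.copy()
    idx = np.fromiter(green, dtype=int)
    logits[idx] += BIAS
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return int(rng.choice(VOCAB_SIZE, p=p))

def detect(tokens, green):
    """One-proportion z-test: is the green-token rate above chance?"""
    n = len(tokens)
    hits = sum(t in green for t in tokens)
    expected = GREEN_FRACTION * n
    return (hits - expected) / np.sqrt(n * GREEN_FRACTION * (1 - GREEN_FRACTION))

green = sample_green_list(pool_seed=42)
tokens = [biased_sample(rng.standard_normal(VOCAB_SIZE), green)
          for _ in range(256)]
print(detect(tokens, green))  # a large positive z-score flags a watermark
```

Sampling the green list from a pool (here, via `pool_seed`) means an attacker who recovers one list cannot strip the watermark from images keyed to other lists, which is the white-box defense the abstract describes.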