Training-Free Watermarking for Autoregressive Image Generation

Abstract

Invisible image watermarking can protect image ownership and prevent malicious misuse of visual generative models. However, existing generative watermarking methods are mainly designed for diffusion models, while watermarking for autoregressive image generation models remains largely underexplored. We propose IndexMark, a training-free watermarking framework for autoregressive image generation models. IndexMark is inspired by the redundancy property of the codebook: replacing autoregressively generated indices with similar indices produces negligible visual differences. The core component of IndexMark is a simple yet effective match-then-replace method, which carefully selects watermark tokens from the codebook based on token similarity and promotes the use of watermark tokens through token replacement, thereby embedding the watermark without affecting image quality. Watermark verification is achieved by calculating the proportion of watermark tokens in generated images, with precision further improved by an Index Encoder. Furthermore, we introduce an auxiliary validation scheme to enhance robustness against cropping attacks. Experiments demonstrate that IndexMark achieves state-of-the-art performance in terms of image quality and verification accuracy, and exhibits robustness against various perturbations, including cropping, noise, Gaussian blur, random erasing, color jittering, and JPEG compression.
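The match-then-replace idea and the proportion-based check described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. The greedy cosine-similarity pairing, the random choice of one watermark token per pair, the replacement of every non-watermark index, and all function names (`build_pairs`, `choose_watermark_tokens`, `embed`, `watermark_ratio`) are assumptions made for illustration. The Index Encoder and the auxiliary cropping scheme are not modeled.

```python
import numpy as np


def build_pairs(codebook):
    """Greedily pair codebook entries by cosine similarity (assumed strategy).

    Returns an array where partner[i] is the index paired with i.
    """
    n = codebook.shape[0]
    if n % 2 != 0:
        raise ValueError("this sketch assumes an even codebook size")
    unit = codebook / np.linalg.norm(codebook, axis=1, keepdims=True)
    sim = unit @ unit.T
    partner = np.full(n, -1, dtype=np.int64)
    # Visit candidate pairs from most to least similar; keep a pair only
    # if both entries are still unpaired.
    for flat in np.argsort(-sim, axis=None):
        i, j = divmod(int(flat), n)
        if i != j and partner[i] < 0 and partner[j] < 0:
            partner[i], partner[j] = j, i
    return partner


def choose_watermark_tokens(partner, seed):
    """Mark exactly one index in each pair as a watermark token."""
    rng = np.random.default_rng(seed)  # the seed plays the role of a secret key
    is_wm = np.zeros(partner.shape[0], dtype=bool)
    for i, j in enumerate(partner):
        if i < j:  # handle each pair once
            is_wm[i if rng.random() < 0.5 else j] = True
    return is_wm


def embed(indices, partner, is_wm):
    """Replace each non-watermark index with its paired watermark index."""
    return np.where(is_wm[indices], indices, partner[indices])


def watermark_ratio(indices, is_wm):
    """Fraction of indices that are watermark tokens."""
    return float(is_wm[indices].mean())


rng = np.random.default_rng(0)
codebook = rng.normal(size=(16, 8))       # toy codebook: 16 entries, 8 dims
partner = build_pairs(codebook)
is_wm = choose_watermark_tokens(partner, seed=1234)
generated = rng.integers(0, 16, size=64)  # stand-in for generated indices
marked = embed(generated, partner, is_wm)
print(watermark_ratio(generated, is_wm), watermark_ratio(marked, is_wm))
```

In this sketch every pair contains exactly one watermark token, so the ratio for `marked` is 1.0 by construction, while unmarked indices are expected to sit near 0.5. A verifier would compare the observed ratio against a threshold between those two values.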
