Watermarking Autoregressive Image Generation

Abstract

Watermarking the outputs of generative models has emerged as a promising approach for tracking their provenance. Despite significant interest in autoregressive image generation models and their potential for misuse, no prior work has attempted to watermark their outputs at the token level. In this work, we present the first such approach by adapting language model watermarking techniques to this setting. We identify a key challenge: the lack of reverse cycle-consistency (RCC), wherein re-tokenizing generated image tokens significantly alters the token sequence, effectively erasing the watermark. To address this and to make our method robust to common image transformations, neural compression, and removal attacks, we introduce (i) a custom tokenizer-detokenizer finetuning procedure that improves RCC, and (ii) a complementary watermark synchronization layer. As our experiments demonstrate, our approach enables reliable and robust watermark detection with theoretically grounded p-values.
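
To make the token-level idea concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes a green-list style language model watermark (the abstract does not say which scheme is adapted), a toy uniform sampler standing in for the image model, and random token replacement standing in for imperfect re-tokenization. The names `VOCAB_SIZE`, `GAMMA`, `SECRET_KEY`, `is_green`, `generate`, `corrupt`, `token_match`, and `detect_p_value` are all hypothetical.

```python
import hashlib
import math
import random

VOCAB_SIZE = 1024          # hypothetical image-token codebook size
GAMMA = 0.25               # assumed fraction of tokens that are "green" per context
SECRET_KEY = b"demo-key"   # hypothetical watermark key


def is_green(prev_token: int, token: int) -> bool:
    """Keyed pseudorandom green/red assignment, seeded by the previous token."""
    msg = SECRET_KEY + prev_token.to_bytes(4, "big") + token.to_bytes(4, "big")
    digest = hashlib.sha256(msg).digest()
    return int.from_bytes(digest[:8], "big") / 2**64 < GAMMA


def generate(length: int, watermark: bool, rng: random.Random) -> list[int]:
    """Toy stand-in for a model: uniform sampling, optionally green tokens only."""
    tokens = [rng.randrange(VOCAB_SIZE)]
    while len(tokens) < length:
        candidate = rng.randrange(VOCAB_SIZE)
        if not watermark or is_green(tokens[-1], candidate):
            tokens.append(candidate)
    return tokens


def corrupt(tokens: list[int], keep_prob: float, rng: random.Random) -> list[int]:
    """Mimic poor RCC: each token survives re-tokenization with prob. keep_prob."""
    return [t if rng.random() < keep_prob else rng.randrange(VOCAB_SIZE)
            for t in tokens]


def token_match(a: list[int], b: list[int]) -> float:
    """Fraction of positions where two equal-length token sequences agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def detect_p_value(tokens: list[int]) -> float:
    """P(X >= green count) for X ~ Binomial(n, GAMMA), the no-watermark null.

    Only unique (previous, current) pairs are scored so repeats are not
    counted twice.
    """
    pairs = set(zip(tokens, tokens[1:]))
    n = len(pairs)
    greens = sum(is_green(a, b) for a, b in pairs)
    return sum(math.comb(n, k) * GAMMA**k * (1 - GAMMA) ** (n - k)
               for k in range(greens, n + 1))


rng = random.Random(0)
watermarked = generate(256, True, rng)
plain = generate(256, False, rng)
retokenized = corrupt(watermarked, 0.3, rng)

print("watermarked p-value:", detect_p_value(watermarked))
print("plain p-value:      ", detect_p_value(plain))
print("token match after corruption:", token_match(watermarked, retokenized))
print("retokenized p-value:", detect_p_value(retokenized))
```

In this toy setting, a pair of adjacent tokens only keeps its green status reliably if both tokens survive re-tokenization, so the detection signal degrades roughly with the square of the token match rate. This is one way to see why improving RCC matters for detection.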
