From Trade-off to Synergy: A Versatile Symbiotic Watermarking Framework for Large Language Models
May 15, 2025
Authors: Yidan Wang, Yubing Ren, Yanan Cao, Binxing Fang
cs.AI
Abstract
The rise of Large Language Models (LLMs) has heightened concerns about the
misuse of AI-generated text, making watermarking a promising solution.
Mainstream watermarking schemes for LLMs fall into two categories: logits-based
and sampling-based. However, current schemes entail trade-offs among
robustness, text quality, and security. To mitigate this, we integrate
logits-based and sampling-based schemes, harnessing their respective strengths
to achieve synergy. In this paper, we propose a versatile symbiotic
watermarking framework with three strategies: serial, parallel, and hybrid. The
hybrid framework adaptively embeds watermarks using token entropy and semantic
entropy, optimizing the balance between detectability, robustness, text
quality, and security. Furthermore, we validate our approach through
comprehensive experiments on various datasets and models. Experimental results
indicate that our method outperforms existing baselines and achieves
state-of-the-art (SOTA) performance. We believe this framework provides novel
insights into diverse watermarking paradigms. Our code is available at
https://github.com/redwyd/SymMark.
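
To make the entropy-based gating in the hybrid strategy more concrete, here is a minimal Python sketch. It is an illustration only and is not taken from the SymMark code. The abstract does not specify how the two entropies are combined, so the function `choose_watermarks`, its thresholds, and the direction of each comparison are hypothetical placeholders. Semantic entropy is taken as a precomputed input because the abstract does not say how it is estimated.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of a next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def choose_watermarks(tok_entropy, sem_entropy, tok_threshold=1.0, sem_threshold=1.0):
    """Hypothetical gating rule, NOT the rule from the paper.

    Returns (use_logits_watermark, use_sampling_watermark). The thresholds
    and the direction of each comparison are assumptions for illustration.
    """
    use_logits = tok_entropy > tok_threshold
    use_sampling = sem_entropy > sem_threshold
    return use_logits, use_sampling


# A flat distribution has high entropy; a peaked one has low entropy.
flat = [0.25, 0.25, 0.25, 0.25]
peaked = [0.97, 0.01, 0.01, 0.01]
print(token_entropy(flat), token_entropy(peaked))
print(choose_watermarks(token_entropy(flat), sem_entropy=0.5))
```

The only point of the sketch is that a per-step entropy signal can switch each watermark on or off independently, which is one way to read "adaptively embeds watermarks"; the actual rule should be taken from the paper and repository.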