
StyleRemix: Interpretable Authorship Obfuscation via Distillation and Perturbation of Style Elements

August 28, 2024
Authors: Jillian Fisher, Skyler Hallinan, Ximing Lu, Mitchell Gordon, Zaid Harchaoui, Yejin Choi
cs.AI

Abstract

Authorship obfuscation, rewriting a text to intentionally obscure the identity of the author, is an important but challenging task. Current methods using large language models (LLMs) lack interpretability and controllability, often ignoring author-specific stylistic features, resulting in less robust performance overall. To address this, we develop StyleRemix, an adaptive and interpretable obfuscation method that perturbs specific, fine-grained style elements of the original input text. StyleRemix uses pre-trained Low Rank Adaptation (LoRA) modules to rewrite an input specifically along various stylistic axes (e.g., formality and length) while maintaining low computational cost. StyleRemix outperforms state-of-the-art baselines and much larger LLMs in a variety of domains as assessed by both automatic and human evaluation. Additionally, we release AuthorMix, a large set of 30K high-quality, long-form texts from a diverse set of 14 authors and 4 domains, and DiSC, a parallel corpus of 1,500 texts spanning seven style axes in 16 unique directions.
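To make the per-axis rewriting idea concrete, below is a minimal, hypothetical sketch of how LoRA adapters for individual style axes could be loaded and chained with the Hugging Face `transformers` and `peft` libraries. The base model, adapter names and paths, and the simple sequential "remix" are illustrative assumptions, not the authors' released implementation.

```python
# Minimal illustrative sketch, NOT the paper's released code.
# Assumes hypothetical per-axis LoRA adapters, each trained to push text
# in one direction along a style axis (e.g., less formal, shorter).
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import PeftModel

BASE = "google/flan-t5-large"                 # assumed base model, for illustration
ADAPTERS = {                                  # hypothetical adapter checkpoints
    "formality_down": "adapters/lora-informal",
    "length_down": "adapters/lora-shorter",
}

tokenizer = AutoTokenizer.from_pretrained(BASE)
base = AutoModelForSeq2SeqLM.from_pretrained(BASE)

# Attach the first adapter, then register the rest by name so we can switch cheaply.
axes = list(ADAPTERS)
model = PeftModel.from_pretrained(base, ADAPTERS[axes[0]], adapter_name=axes[0])
for axis in axes[1:]:
    model.load_adapter(ADAPTERS[axis], adapter_name=axis)

def rewrite(text: str, axis: str) -> str:
    """Rewrite `text` along a single style axis via its LoRA adapter."""
    model.set_adapter(axis)
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=256)
    return tokenizer.decode(out[0], skip_special_tokens=True)

# "Remixing": chain rewrites along the axes selected for a given author.
text = "The results, in my considered opinion, were rather extraordinary."
obfuscated = rewrite(rewrite(text, "formality_down"), "length_down")
print(obfuscated)
```

Because each axis is a small adapter on a shared base model, switching or combining style directions stays cheap relative to prompting a much larger LLM for every rewrite.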