TAROT: Task-Oriented Authorship Obfuscation Using Policy Optimization Methods

Abstract

Authorship obfuscation aims to disguise the identity of an author within a text by altering the writing style, vocabulary, syntax, and other linguistic features associated with the author. This alteration must balance privacy and utility. Strong obfuscation techniques can effectively hide the author's identity, but they often degrade the quality and usefulness of the text for its intended purpose. Conversely, maintaining high utility tends to provide insufficient privacy, making it easier for an adversary to de-anonymize the author. Achieving a good trade-off between these two conflicting objectives is therefore crucial. In this paper, we propose TAROT: Task-Oriented Authorship Obfuscation Using Policy Optimization, a new unsupervised authorship obfuscation method that aims to optimize the privacy-utility trade-off by regenerating the entire text while accounting for its downstream utility. Our approach uses policy optimization as a fine-tuning paradigm over small language models to rewrite texts while protecting the author's identity and preserving downstream task utility. We show that our approach substantially reduces the accuracy of attackers while preserving utility. We make our code and models publicly available.
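The abstract does not state which reward drives the policy optimization. As a purely illustrative sketch of a privacy-utility trade-off, assume each text is represented by two hypothetical embedding vectors, one for content and one for writing style, and that a rewrite is scored higher when it keeps the content close and moves the style away. The function names, the use of cosine similarity, and the `privacy_weight` parameter are all assumptions made for this example and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def tradeoff_reward(orig_content, new_content, orig_style, new_style,
                    privacy_weight=0.5):
    """Hypothetical scalar reward for a rewritten text.

    utility: how close the rewrite's content embedding stays to the original.
    privacy: how far the rewrite's style embedding moves from the original.
    privacy_weight (assumed, in [0, 1]) balances the two terms.
    """
    utility = cosine(orig_content, new_content)
    privacy = 1.0 - cosine(orig_style, new_style)
    return (1.0 - privacy_weight) * utility + privacy_weight * privacy


# Toy vectors: same content, but the rewrite has an orthogonal style vector.
rewritten = tradeoff_reward([1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
# Same content and unchanged style (no obfuscation at all).
unchanged = tradeoff_reward([1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0])
print(rewritten > unchanged)
```

In a policy-optimization setup, a scalar of this kind would be the signal used to update the rewriting model. Raising `privacy_weight` favors stronger obfuscation at the expense of content fidelity, which mirrors the trade-off the abstract describes.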
