TAROT: Task-Oriented Authorship Obfuscation Using Policy Optimization Methods

Abstract

Authorship obfuscation aims to disguise the identity of a text's author by altering the writing style, vocabulary, syntax, and other linguistic features associated with that author. This alteration must balance privacy and utility. Strong obfuscation techniques can effectively hide the author's identity, but they often degrade the quality and usefulness of the text for its intended purpose. Conversely, maintaining high utility tends to provide insufficient privacy, making it easier for an adversary to de-anonymize the author. Achieving an optimal trade-off between these two conflicting objectives is therefore crucial. In this paper, we propose TAROT: Task-Oriented Authorship Obfuscation Using Policy Optimization, a new unsupervised authorship obfuscation method whose goal is to optimize the privacy-utility trade-off by regenerating the entire text while taking its downstream utility into account. Our approach uses policy optimization as a fine-tuning paradigm over small language models to rewrite texts while protecting author identity and preserving downstream task utility. We show that our approach substantially reduces the accuracy of attackers while preserving utility. We make our code and models publicly available.
