
Machine Text Detectors are Membership Inference Attacks

October 22, 2025
Authors: Ryuto Koike, Liam Dugan, Masahiro Kaneko, Chris Callison-Burch, Naoaki Okazaki
cs.AI

Abstract

Although membership inference attacks (MIAs) and machine-generated text detection target different goals (identifying training samples and synthetic texts, respectively), their methods often exploit similar signals based on a language model's probability distribution. Despite this shared methodological foundation, the two tasks have been studied independently, which may lead to conclusions that overlook stronger methods and valuable insights developed for the other task. In this work, we theoretically and empirically investigate the transferability between MIAs and machine text detection, i.e., how well a method originally developed for one task performs on the other. For our theoretical contribution, we prove that the metric achieving the asymptotically highest performance is the same for both tasks. We unify a large proportion of the existing literature in the context of this optimal metric and hypothesize that the accuracy with which a given method approximates it is directly correlated with its transferability. Our large-scale empirical experiments, covering 7 state-of-the-art MIA methods and 5 state-of-the-art machine text detectors across 13 domains and 10 generators, demonstrate a very strong rank correlation (rho > 0.6) in cross-task performance. Notably, Binoculars, originally designed for machine text detection, also achieves state-of-the-art performance on MIA benchmarks, demonstrating the practical impact of this transferability. Our findings highlight the need for greater cross-task awareness and collaboration between the two research communities. To facilitate cross-task development and fair evaluation, we introduce MINT, a unified evaluation suite for MIAs and machine-generated text detection with implementations of 15 recent methods from both tasks.
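As a rough illustration of the shared probability-based signal the abstract describes (not the paper's own method or the MINT suite), the sketch below scores a text by its average token log-likelihood under a single causal language model and reuses that one score for both tasks, in the spirit of the classic loss-based MIA and the log-probability baseline for machine-text detection. The scoring model `gpt2` and the helper names are placeholders chosen for this example.

```python
# Minimal sketch: the same token-level log-probability signal can serve as a
# score for both membership inference and machine-text detection.
# Model name and helper names are placeholders, not values from the paper.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder scoring model
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

@torch.no_grad()
def avg_log_likelihood(text: str) -> float:
    """Average per-token log-likelihood of `text` under the scoring model."""
    enc = tokenizer(text, return_tensors="pt")
    out = model(**enc, labels=enc["input_ids"])
    return -out.loss.item()  # loss is the mean negative log-likelihood

def mia_score(text: str) -> float:
    # Higher likelihood -> more plausibly a training member (loss-based MIA).
    return avg_log_likelihood(text)

def machine_text_score(text: str) -> float:
    # Higher likelihood -> more likely machine-generated (log-probability baseline).
    return avg_log_likelihood(text)
```

Methods such as Binoculars refine this kind of signal by contrasting the probability distributions of two models rather than thresholding a single likelihood, which is one reading of why performance transfers so strongly across the two tasks.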