DNA-GPT: Divergent N-Gram Analysis for Training-Free Detection of GPT-Generated Text
May 27, 2023
Authors: Xianjun Yang, Wei Cheng, Linda Petzold, William Yang Wang, Haifeng Chen
cs.AI
Abstract
Large language models (LLMs) have notably enhanced the fluency and diversity
of machine-generated text. However, this progress also presents a significant
challenge in detecting the origin of a given text, and current research on
detection methods lags behind the rapid evolution of LLMs. Conventional
training-based methods have limitations in flexibility, particularly when
adapting to new domains, and they often lack explanatory power. To address this
gap, we propose a novel training-free detection strategy called Divergent
N-Gram Analysis (DNA-GPT). Given a text, we first truncate it in the middle and
then use only the preceding portion as input to the LLM to regenerate the
remaining part. By comparing the original and regenerated remainders through
N-gram analysis in the black-box setting or probability divergence in the
white-box setting, we can clearly illustrate significant discrepancies between
machine-generated and human-written text. We conducted extensive experiments on
the most advanced LLMs from OpenAI, including text-davinci-003, GPT-3.5-turbo,
and GPT-4, as well as open-source models such as GPT-NeoX-20B and LLaMa-13B.
Results show that our zero-shot approach exhibits state-of-the-art performance
in distinguishing between human-written and GPT-generated text on four English
datasets and one German dataset, outperforming OpenAI's own classifier, which
is trained on millions of texts. Additionally, our method provides reasonable
explanations and evidence to support its claims, a unique feature of
explainable detection. Our method is also robust against revised-text attacks
and can additionally address the model sourcing problem. Code is available at
https://github.com/Xianjun-Yang/DNA-GPT.
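
To make the pipeline concrete, below is a minimal Python sketch of the
black-box variant: it truncates a document at its midpoint, regenerates the
remainder k times with an arbitrary LLM, and scores the weighted n-gram
overlap between the regenerations and the true remainder (the white-box
variant instead compares token log-probabilities). The generate_fn callable,
the n-gram range, the length weighting, and the decision threshold are
illustrative assumptions, not the paper's exact settings.

from collections import Counter

def ngrams(tokens, n):
    # Multiset of n-grams in a token list.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def overlap_ratio(cand_tokens, ref_tokens, n):
    # Fraction of candidate n-grams that also appear in the reference.
    cand, ref = ngrams(cand_tokens, n), ngrams(ref_tokens, n)
    if not cand:
        return 0.0
    hits = sum(min(count, ref[gram]) for gram, count in cand.items())
    return hits / sum(cand.values())

def dna_gpt_score(text, generate_fn, k=10, n_range=range(3, 13)):
    # Truncate `text` in the middle, regenerate the remainder k times,
    # and measure how strongly the regenerations overlap the true
    # remainder. Machine-generated text tends to score higher because
    # the LLM reproduces its own continuations more faithfully.
    words = text.split()
    prefix = " ".join(words[: len(words) // 2])
    remainder = words[len(words) // 2:]
    score = 0.0
    for _ in range(k):
        regen = generate_fn(prefix).split()  # hypothetical LLM API wrapper
        # Weight longer n-grams more heavily: long shared spans are far
        # less likely to occur by chance.
        score += sum(n * overlap_ratio(regen, remainder, n) for n in n_range)
    return score / (k * sum(n_range))

# Usage (illustrative threshold, to be calibrated on held-out data):
# is_machine = dna_gpt_score(doc, my_llm_generate) > 0.02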