DNA-GPT: Divergent N-Gram Analysis for Training-Free Detection of GPT-Generated Text
May 27, 2023
Authors: Xianjun Yang, Wei Cheng, Linda Petzold, William Yang Wang, Haifeng Chen
cs.AI
Abstract
Large language models (LLMs) have notably enhanced the fluency and diversity
of machine-generated text. However, this progress also presents a significant
challenge in detecting the origin of a given text, and current research on
detection methods lags behind the rapid evolution of LLMs. Conventional
training-based methods have limitations in flexibility, particularly when
adapting to new domains, and they often lack explanatory power. To address this
gap, we propose a novel training-free detection strategy called Divergent
N-Gram Analysis (DNA-GPT). Given a text, we first truncate it in the middle and
then use only the preceding portion as input to the LLM to regenerate the
remaining part. By comparing the original and regenerated remainders through
N-gram analysis in the black-box setting or probability divergence in the
white-box setting, we can clearly illustrate significant discrepancies between
machine-generated and human-written text. We conducted extensive experiments on
the most advanced LLMs from OpenAI, including text-davinci-003, GPT-3.5-turbo,
and GPT-4, as well as open-source models such as GPT-NeoX-20B and LLaMa-13B.
Results show that our zero-shot approach exhibits state-of-the-art performance
in distinguishing between human-written and GPT-generated text on four English
datasets and one German dataset, outperforming OpenAI's own classifier, which
is trained on millions of texts. Additionally, our method provides reasonable
explanations and evidence to support its claims, a unique feature of
explainable detection. Our method is also robust against revised-text attacks
and can additionally address the model sourcing problem. Code is available at
https://github.com/Xianjun-Yang/DNA-GPT.
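
To make the pipeline concrete, below is a minimal Python sketch of the
black-box variant: it truncates a document at its midpoint, regenerates the
remainder k times with an arbitrary LLM, and scores the weighted n-gram
overlap between the regenerations and the true remainder (the white-box
variant instead compares token log-probabilities). The generate_fn callable,
the n-gram range, the length weighting, and the decision threshold are
illustrative assumptions, not the paper's exact settings.

from collections import Counter

def ngrams(tokens, n):
    # Multiset of n-grams in a token list.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def overlap_ratio(cand_tokens, ref_tokens, n):
    # Fraction of candidate n-grams that also appear in the reference.
    cand, ref = ngrams(cand_tokens, n), ngrams(ref_tokens, n)
    if not cand:
        return 0.0
    hits = sum(min(count, ref[gram]) for gram, count in cand.items())
    return hits / sum(cand.values())

def dna_gpt_score(text, generate_fn, k=10, n_range=range(3, 13)):
    # Truncate `text` in the middle, regenerate the remainder k times,
    # and measure how strongly the regenerations overlap the true
    # remainder. Machine-generated text tends to score higher because
    # the LLM reproduces its own continuations more faithfully.
    words = text.split()
    prefix = " ".join(words[: len(words) // 2])
    remainder = words[len(words) // 2:]
    score = 0.0
    for _ in range(k):
        regen = generate_fn(prefix).split()  # hypothetical LLM API wrapper
        # Weight longer n-grams more heavily: long shared spans are far
        # less likely to occur by chance.
        score += sum(n * overlap_ratio(regen, remainder, n) for n in n_range)
    return score / (k * sum(n_range))

# Usage (illustrative threshold, to be calibrated on held-out data):
# is_machine = dna_gpt_score(doc, my_llm_generate) > 0.02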