

DNA-GPT: Divergent N-Gram Analysis for Training-Free Detection of GPT-Generated Text

May 27, 2023
Authors: Xianjun Yang, Wei Cheng, Linda Petzold, William Yang Wang, Haifeng Chen
cs.AI

Abstract

Large language models (LLMs) have notably enhanced the fluency and diversity of machine-generated text. However, this progress also presents a significant challenge in detecting the origin of a given text, and current research on detection methods lags behind the rapid evolution of LLMs. Conventional training-based methods have limitations in flexibility, particularly when adapting to new domains, and they often lack explanatory power. To address this gap, we propose a novel training-free detection strategy called Divergent N-Gram Analysis (DNA-GPT). Given a text, we first truncate it in the middle and then use only the preceding portion as input to the LLM to regenerate the remaining part. By comparing the original and regenerated remainders, through N-gram analysis in the black-box setting or probability divergence in the white-box setting, we can clearly illustrate significant discrepancies between machine-generated and human-written text. We conducted extensive experiments on the most advanced LLMs from OpenAI, including text-davinci-003, GPT-3.5-turbo, and GPT-4, as well as open-source models such as GPT-NeoX-20B and LLaMa-13B. Results show that our zero-shot approach exhibits state-of-the-art performance in distinguishing between human- and GPT-generated text on four English datasets and one German dataset, outperforming OpenAI's own classifier, which is trained on millions of texts. Additionally, our method provides reasonable explanations and evidence to support its verdicts, a unique feature of explainable detection. Our method is also robust under revised-text attacks and can additionally solve the problem of model sourcing. Code is available at https://github.com/Xianjun-Yang/DNA-GPT.
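The abstract describes the black-box pipeline in enough detail to sketch it: truncate the text, regenerate continuations of the prefix with the candidate LLM, and score the n-gram overlap between the original and regenerated remainders. The Python sketch below illustrates this idea under stated assumptions: `regenerate` is a hypothetical callable standing in for the candidate LLM's sampling API (returning K sampled continuations of the prefix), and the n·log(n) overlap weighting is one plausible choice for favoring long shared n-grams, not necessarily the paper's exact published scoring formula.

```python
import math
from typing import Callable, List, Set, Tuple


def ngram_set(tokens: List[str], n: int) -> Set[Tuple[str, ...]]:
    """All n-grams of a token list, as a set."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def dna_gpt_score(text: str,
                  regenerate: Callable[[str], List[str]],
                  truncate_ratio: float = 0.5,
                  n_lo: int = 4,
                  n_hi: int = 25) -> float:
    """Black-box DNA-GPT-style divergence score (illustrative sketch).

    Split `text` at `truncate_ratio`, ask the candidate LLM (via the
    user-supplied `regenerate` callable, assumed here) for fresh
    continuations of the prefix, and accumulate weighted n-gram overlap
    between the original tail and each regenerated tail. Text the model
    itself produced tends to be regenerated closely, yielding a higher
    score than human-written text.
    """
    tokens = text.split()
    cut = int(len(tokens) * truncate_ratio)
    prefix, tail = " ".join(tokens[:cut]), tokens[cut:]

    score, k = 0.0, 0
    for cont in regenerate(prefix):          # K regenerated continuations
        cont_tokens = cont.split()
        k += 1
        for n in range(n_lo, n_hi + 1):
            orig = ngram_set(tail, n)
            if not orig:
                break                        # original tail shorter than n
            shared = orig & ngram_set(cont_tokens, n)
            # Longer shared n-grams are weighted more (n * log n); the
            # raw count is normalised by the sizes of both continuations.
            score += (n * math.log(n) * len(shared)
                      / (len(cont_tokens) * len(orig) + 1e-9))
    return score / max(k, 1)
```

In practice one would compute this score over pools of known human-written and known machine-generated texts and pick a decision threshold (e.g., the one maximizing AUROC); the shared n-grams themselves double as the human-readable evidence the abstract refers to, since they show exactly which spans the model reproduces.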