Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text

January 22, 2024
Authors: Abhimanyu Hans, Avi Schwarzschild, Valeriia Cherepanova, Hamid Kazemi, Aniruddha Saha, Micah Goldblum, Jonas Geiping, Tom Goldstein
cs.AI

Abstract

Detecting text generated by modern large language models is thought to be hard, as both LLMs and humans can exhibit a wide range of complex behaviors. However, we find that a score based on contrasting two closely related language models is highly accurate at separating human-generated and machine-generated text. Based on this mechanism, we propose a novel LLM detector that only requires simple calculations using a pair of pre-trained LLMs. The method, called Binoculars, achieves state-of-the-art accuracy without any training data. It is capable of spotting machine text from a range of modern LLMs without any model-specific modifications. We comprehensively evaluate Binoculars on a number of text sources and in varied situations. Over a wide range of document types, Binoculars detects over 90% of generated samples from ChatGPT (and other LLMs) at a false positive rate of 0.01%, despite not being trained on any ChatGPT data.
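The abstract does not spell out the score itself, so the following is only a minimal sketch of one way to contrast two language models and to read off a detection rate at a fixed false positive rate. It assumes the score is the ratio of one model's log-perplexity on the text to the cross-perplexity between the two models' next-token distributions, and that lower scores point to machine text. The function names (`contrast_score`, `tpr_at_fpr`) and the toy inputs are hypothetical; in practice the logits would come from two real pre-trained LLMs that share a tokenizer.

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over the vocabulary axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def contrast_score(logits_a, logits_b, token_ids):
    """Assumed score: log-perplexity under model A / cross-perplexity of A vs. B.

    logits_a, logits_b: arrays of shape (seq_len, vocab_size) holding each
        model's next-token logits at every position (hypothetical inputs).
    token_ids: the tokens actually observed at those positions.
    """
    logp_a = log_softmax(logits_a)
    logp_b = log_softmax(logits_b)
    positions = np.arange(len(token_ids))
    # Mean negative log-likelihood of the observed tokens under model A.
    log_ppl = -logp_a[positions, token_ids].mean()
    # Mean cross-entropy between A's and B's next-token distributions.
    cross_ppl = -(np.exp(logp_a) * logp_b).sum(axis=-1).mean()
    return log_ppl / cross_ppl


def tpr_at_fpr(human_scores, machine_scores, fpr):
    """Fraction of machine texts flagged when at most `fpr` of human texts are.

    Assumes a text is flagged as machine-generated when its score is
    strictly below the threshold.
    """
    human_sorted = np.sort(np.asarray(human_scores))
    k = int(np.floor(fpr * len(human_sorted)))  # human texts we may flag
    threshold = human_sorted[k] if k < len(human_sorted) else np.inf
    return float((np.asarray(machine_scores) < threshold).mean())
```

With a threshold calibrated this way on human-written text, a figure such as "over 90% detected at a false positive rate of 0.01%" corresponds to `tpr_at_fpr(human_scores, machine_scores, 1e-4) > 0.9`. Estimating a rate that small reliably requires many thousands of human-written samples.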