LLM-DetectAIve: a Tool for Fine-Grained Machine-Generated Text Detection

August 8, 2024
Authors: Mervat Abassy, Kareem Elozeiri, Alexander Aziz, Minh Ngoc Ta, Raj Vardhan Tomar, Bimarsha Adhikari, Saad El Dine Ahmed, Yuxia Wang, Osama Mohammed Afzal, Zhuohan Xie, Jonibek Mansurov, Ekaterina Artemova, Vladislav Mikhailov, Rui Xing, Jiahui Geng, Hasan Iqbal, Zain Muhammad Mujahid, Tarek Mahmoud, Akim Tsvigun, Alham Fikri Aji, Artem Shelmanov, Nizar Habash, Iryna Gurevych, Preslav Nakov
cs.AI

Abstract

The widespread accessibility of large language models (LLMs) to the general public has significantly amplified the dissemination of machine-generated texts (MGTs). Advances in prompt manipulation have made it harder to discern the origin of a text (human-authored vs. machine-generated). This raises concerns about the potential misuse of MGTs, particularly in educational and academic domains. In this paper, we present LLM-DetectAIve, a system designed for fine-grained MGT detection. It classifies texts into four categories: human-written, machine-generated, machine-written machine-humanized, and human-written machine-polished. In contrast to previous MGT detectors, which perform binary classification, the two additional categories in LLM-DetectAIve offer insight into the varying degrees of LLM intervention during text creation. This may be useful in domains such as education, where any LLM intervention is usually prohibited. Experiments show that LLM-DetectAIve can effectively identify the authorship of textual content, demonstrating its usefulness in enhancing integrity in education, academia, and other domains. LLM-DetectAIve is publicly accessible at https://huggingface.co/spaces/raj-tomar001/MGT-New. The video describing our system is available at https://youtu.be/E8eT_bE7k8c.
