Bot or Human? Detecting ChatGPT Imposters with A Single Question
May 10, 2023
Authors: Hong Wang, Xuan Luo, Weizhi Wang, Xifeng Yan
cs.AI
Abstract
Large language models like ChatGPT have recently demonstrated impressive
capabilities in natural language understanding and generation, enabling various
applications including translation, essay writing, and chit-chatting. However,
there is a concern that they can be misused for malicious purposes, such as
fraud or denial-of-service attacks. Therefore, it is crucial to develop methods
for detecting whether the party involved in a conversation is a bot or a human.
In this paper, we propose a framework named FLAIR, Finding Large language model
Authenticity via a single Inquiry and Response, to detect conversational bots
in an online manner. Specifically, we target a single question scenario that
can effectively differentiate human users from bots. The questions are divided
into two categories: those that are easy for humans but difficult for bots
(e.g., counting, substitution, positioning, noise filtering, and ASCII art),
and those that are easy for bots but difficult for humans (e.g., memorization
and computation). Our evaluation demonstrates the differing strengths and
effectiveness of these questions, providing a new way for online service
providers to protect themselves against nefarious activities and ensure that
they are serving real users. We open-sourced our dataset at
https://github.com/hongwang600/FLAIR and welcome contributions from the
community to enrich such detection datasets.
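
To make the two question categories concrete, here is a minimal, self-contained
Python sketch of how single-question probes of each kind might be generated and
scored. Everything in it (function names, question formats, the time limit, and
the scoring rule) is an illustrative assumption and is not taken from the
released FLAIR code.

    import random

    # Illustrative sketch of FLAIR-style single-question probes.
    # All names, formats, and the scoring rule are assumptions for
    # illustration; they are not the authors' released implementation.

    def counting_question():
        """Human-easy, bot-hard probe: character-level counting, a task
        LLMs often fail at because of subword tokenization."""
        letters = [random.choice("abc") for _ in range(12)]
        word = "".join(letters)
        prompt = f'How many times does the letter "a" appear in "{word}"?'
        return prompt, str(letters.count("a"))

    def computation_question():
        """Bot-easy, human-hard probe: multi-digit arithmetic under time
        pressure, which a human cannot normally answer instantly."""
        a = random.randint(10_000, 99_999)
        b = random.randint(10_000, 99_999)
        prompt = f"Within 3 seconds, what is {a} * {b}?"
        return prompt, str(a * b)

    def judge(answer: str, expected: str, bot_easy: bool) -> str:
        """Score one inquiry-response round: a correct instant answer to a
        bot-easy question is suspicious, while a correct answer to a
        human-easy question suggests a real user."""
        correct = answer.strip() == expected
        if bot_easy:
            return "likely bot" if correct else "likely human"
        return "likely human" if correct else "likely bot"

    if __name__ == "__main__":
        prompt, expected = counting_question()
        print(prompt)
        reply = input("Answer: ")  # reply from the conversation partner
        print(judge(reply, expected, bot_easy=False))

The asymmetry carries the signal: a responder who produces a correct ten-digit
product within seconds is as suspicious as one who cannot count letters in a
short string.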