Bot or Human? Detecting ChatGPT Imposters with A Single Question
May 10, 2023
Authors: Hong Wang, Xuan Luo, Weizhi Wang, Xifeng Yan
cs.AI
Abstract
Large language models like ChatGPT have recently demonstrated impressive
capabilities in natural language understanding and generation, enabling various
applications including translation, essay writing, and chit-chatting. However,
there is a concern that they can be misused for malicious purposes, such as
fraud or denial-of-service attacks. Therefore, it is crucial to develop methods
for detecting whether the party involved in a conversation is a bot or a human.
In this paper, we propose a framework named FLAIR, Finding Large language model
Authenticity via a single Inquiry and Response, to detect conversational bots
in an online manner. Specifically, we target a single question scenario that
can effectively differentiate human users from bots. The questions are divided
into two categories: those that are easy for humans but difficult for bots
(e.g., counting, substitution, positioning, noise filtering, and ASCII art),
and those that are easy for bots but difficult for humans (e.g., memorization
and computation). Our approach demonstrates the differing effectiveness of
these question types, providing a new way for online service providers to
protect themselves against nefarious activities and ensure that they are
serving real users. We have open-sourced our dataset at
https://github.com/hongwang600/FLAIR and welcome contributions from the
community to enrich such detection datasets.
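To make the "easy for humans, hard for bots" idea concrete, the sketch below generates a counting-style challenge of the kind the abstract describes and checks a free-text reply against the known answer. This is a hypothetical illustration, not the official FLAIR implementation; the function names and the acceptance rule are assumptions.

```python
import random
import string

def make_counting_question(word_length=10, seed=None):
    """Build a FLAIR-style counting question (hypothetical sketch):
    embed a target letter a known number of times in a random string,
    then ask how often it appears. Easy for humans; counting characters
    has historically been difficult for LLMs."""
    rng = random.Random(seed)
    target = rng.choice(string.ascii_lowercase)
    count = rng.randint(2, 5)
    # Fill the rest of the string with letters other than the target,
    # so the target's true frequency is exactly `count`.
    others = [rng.choice([c for c in string.ascii_lowercase if c != target])
              for _ in range(word_length - count)]
    letters = others + [target] * count
    rng.shuffle(letters)
    text = "".join(letters)
    question = f'How many times does the letter "{target}" appear in "{text}"?'
    return question, count

def check_answer(response: str, expected: int) -> bool:
    """Accept the reply if it contains the expected count.
    A simple heuristic; a real deployment would parse more carefully."""
    return str(expected) in response

question, answer = make_counting_question(seed=42)
print(question)
print(check_answer(f"The letter appears {answer} times.", answer))
```

A service could issue one such question per session and treat a wrong count as a signal that the other party may be a bot, mirroring the single-inquiry-and-response setting the paper targets.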