Bot or Human? Detecting ChatGPT Imposters with A Single Question

Abstract

Large language models like ChatGPT have recently demonstrated impressive capabilities in natural language understanding and generation, enabling applications such as translation, essay writing, and chit-chat. However, there is concern that they can be misused for malicious purposes, such as fraud or denial-of-service attacks. It is therefore crucial to develop methods for detecting whether the other party in a conversation is a bot or a human. In this paper, we propose a framework named FLAIR (Finding Large language model Authenticity via a single Inquiry and Response) to detect conversational bots in an online manner. Specifically, we target a single-question scenario in which one question can effectively differentiate human users from bots. The questions fall into two categories: those that are easy for humans but difficult for bots (e.g., counting, substitution, positioning, noise filtering, and ASCII art), and those that are easy for bots but difficult for humans (e.g., memorization and computation). Our approach shows that these question types have different strengths in their effectiveness, and it provides a new way for online service providers to protect themselves against nefarious activities and ensure that they are serving real users. Our dataset is open-sourced at https://github.com/hongwang600/FLAIR, and we welcome contributions from the community to enrich such detection datasets.
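As an illustration of the "easy for humans, difficult for bots" category, a counting question could be generated and checked roughly as follows. This is only a minimal sketch: the function names `make_counting_question` and `is_correct`, the prompt wording, and the string length are assumptions made for this example and are not taken from the FLAIR dataset or code.

```python
import random
import re
import string


def make_counting_question(length=20, rng=None):
    """Return (prompt, expected_count) for a hypothetical counting challenge."""
    rng = rng or random.Random()
    # Pick a target letter and one distractor letter (illustrative choice).
    target = rng.choice(string.ascii_lowercase)
    other = rng.choice([c for c in string.ascii_lowercase if c != target])
    text = "".join(rng.choice([target, other]) for _ in range(length))
    prompt = f"How many times does '{target}' appear in '{text}'?"
    return prompt, text.count(target)


def is_correct(reply, expected):
    """Check whether the first integer in the reply equals the expected count."""
    match = re.search(r"\d+", reply)
    return match is not None and int(match.group()) == expected


if __name__ == "__main__":
    prompt, expected = make_counting_question(rng=random.Random(0))
    print(prompt)
    print("expected:", expected)
```

In a single-question setting, a wrong answer to such a prompt would count as evidence that the respondent is a bot, while a correct answer would count as evidence of a human. How much weight that evidence deserves depends on the measured accuracy of humans and bots on this question type.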
