Human or Not? A Gamified Approach to the Turing Test

Abstract

We present "Human or Not?", an online game inspired by the Turing test that measures the ability of AI chatbots to mimic humans in dialogue and the ability of humans to tell bots from other humans. Over the course of a month, more than 1.5 million users played the game, engaging in anonymous two-minute chat sessions with either another human or an AI language model prompted to behave like a human. The players' task was to guess correctly whether they had spoken to a person or to an AI. This Turing-style test, the largest conducted to date, revealed some interesting facts. For example, users guessed the identity of their partner correctly in only 68% of games overall. In the subset of games in which users faced an AI bot, the correct-guess rate was even lower, at 60% (that is, not much higher than chance). This white paper details the development, deployment, and results of this experiment. While the experiment calls for many extensions and refinements, these findings already begin to shed light on the inevitable near future in which humans and AI will commingle.
