Playing repeated games with Large Language Models
May 26, 2023
Authors: Elif Akata, Lion Schulz, Julian Coda-Forno, Seong Joon Oh, Matthias Bethge, Eric Schulz
cs.AI
Abstract
Large Language Models (LLMs) are transforming society and permeating into
diverse applications. As a result, LLMs will frequently interact with us and
other agents. It is, therefore, of great societal value to understand how LLMs
behave in interactive social settings. Here, we propose to use behavioral game
theory to study LLMs' cooperation and coordination behavior. To do so, we let
different LLMs (GPT-3, GPT-3.5, and GPT-4) play finitely repeated games with
each other and with other, human-like strategies. Our results show that LLMs
generally perform well in such tasks and also uncover persistent behavioral
signatures. In a large set of two-player, two-strategy games, we find that
LLMs are particularly good at games where valuing their own self-interest pays
off, like the iterated Prisoner's Dilemma family. However, they behave
sub-optimally in games that require coordination. We, therefore, further focus
on two games from these distinct families. In the canonical iterated Prisoner's
Dilemma, we find that GPT-4 acts particularly unforgivingly, always defecting
after another agent has defected only once. In the Battle of the Sexes, we find
that GPT-4 cannot match the behavior of the simple convention to alternate
between options. We verify that these behavioral signatures are stable across
robustness checks. Finally, we show how GPT-4's behavior can be modified by
providing further information about the other player as well as by asking it to
predict the other player's actions before making a choice. These results enrich
our understanding of LLMs' social behavior and pave the way for a behavioral
game theory for machines.
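The "unforgiving" behavior the abstract attributes to GPT-4 in the iterated Prisoner's Dilemma corresponds to what game theorists call a grim-trigger strategy: cooperate until the opponent defects once, then defect forever. The sketch below simulates that dynamic in a finitely repeated Prisoner's Dilemma. It is an illustrative toy, not the paper's code; the payoff values and strategy names are assumptions chosen only to satisfy the standard Prisoner's Dilemma ordering (temptation > reward > punishment > sucker's payoff).

```python
# Illustrative sketch (not the paper's implementation) of a finitely
# repeated Prisoner's Dilemma. Payoff values are hypothetical but
# respect the standard ordering T > R > P > S.

PAYOFFS = {  # (my move, their move) -> my payoff; C = cooperate, D = defect
    ("C", "C"): 8,   # R: mutual cooperation
    ("C", "D"): 0,   # S: sucker's payoff
    ("D", "C"): 10,  # T: temptation to defect
    ("D", "D"): 5,   # P: mutual defection
}

def grim_trigger(my_hist, their_hist):
    """Cooperate until the opponent defects once, then always defect --
    the 'unforgiving' signature described in the abstract."""
    return "D" if "D" in their_hist else "C"

def tit_for_tat(my_hist, their_hist):
    """Cooperate first, then copy the opponent's previous move."""
    return their_hist[-1] if their_hist else "C"

def play(strat_a, strat_b, rounds=10):
    """Run a finitely repeated game and return total scores and histories."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strat_a(hist_a, hist_b)
        b = strat_b(hist_b, hist_a)
        score_a += PAYOFFS[(a, b)]
        score_b += PAYOFFS[(b, a)]
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b, hist_a, hist_b
```

Against tit-for-tat, grim trigger never triggers and both players cooperate throughout; against an always-defecting opponent, it pays the sucker's payoff exactly once and then settles into mutual defection, which is the pattern the abstract reports for GPT-4.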