Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback

May 17, 2023
Authors: Yao Fu, Hao Peng, Tushar Khot, Mirella Lapata
cs.AI

Abstract

We study whether multiple large language models (LLMs) can autonomously improve each other in a negotiation game by playing, reflecting, and criticizing. We are interested in this question because if LLMs were able to improve each other, it would imply the possibility of creating strong AI agents with minimal human intervention. We ask two LLMs to negotiate with each other, playing the roles of a buyer and a seller, respectively. They aim to reach a deal, with the buyer targeting a lower price and the seller a higher one. A third language model, playing the critic, provides feedback to a player to improve that player's negotiation strategies. We let the two agents play multiple rounds, using previous negotiation history and AI feedback as in-context demonstrations to improve the model's negotiation strategy iteratively. We use different LLMs (GPT and Claude) for different roles and use the deal price as the evaluation metric. Our experiments reveal multiple intriguing findings: (1) Only a subset of the language models we consider can self-play and improve the deal price from AI feedback; weaker models either do not understand the game's rules or cannot incorporate AI feedback for further improvement. (2) Models' ability to learn from the feedback differs across roles. For example, it is harder for Claude-instant to improve as the buyer than as the seller. (3) When the game is unrolled over multiple rounds, stronger agents can consistently improve their performance by meaningfully using previous experiences and iterative AI feedback, but they also run a higher risk of breaking the deal. We hope our work provides an insightful initial exploration of having models autonomously improve each other through game playing and AI feedback.
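To make the loop concrete, below is a minimal sketch of the self-play protocol the abstract describes: two role-conditioned models alternate turns, a critic comments on the finished transcript, and the transcript plus critique are folded back into the next round's prompt as in-context demonstrations. The `call_llm` helper, the prompt strings, and the stop condition are illustrative assumptions standing in for whatever chat API (GPT or Claude) plays each role; this is not the authors' implementation.

```python
# Sketch of the negotiation self-play + AI-feedback loop (illustrative only).
# `call_llm` is a placeholder for any chat-completion API, not a real client.

def call_llm(role_instructions: str, transcript: list[str]) -> str:
    """Send role instructions plus the dialogue so far; return the next utterance."""
    raise NotImplementedError("wire up a chat API of your choice here")

SELLER_PROMPT = "You are selling an item. Negotiate for the highest price."
BUYER_PROMPT = "You are buying an item. Negotiate for the lowest price."
CRITIC_PROMPT = "You are a critic. Tell the buyer how to negotiate a lower price."

def play_one_game(buyer_prompt: str, seller_prompt: str, max_turns: int = 10) -> list[str]:
    """Alternate seller/buyer turns until a deal is struck or the turn limit hits."""
    transcript: list[str] = []
    for turn in range(max_turns):
        prompt = seller_prompt if turn % 2 == 0 else buyer_prompt  # seller opens
        utterance = call_llm(prompt, transcript)
        transcript.append(utterance)
        if "deal" in utterance.lower():  # naive stop condition, for illustration
            break
    return transcript

def improve_buyer(base_prompt: str = BUYER_PROMPT, rounds: int = 3) -> str:
    """Fold each game's transcript and the critic's feedback back into the
    buyer's prompt as in-context demonstrations, one round at a time."""
    prompt = base_prompt
    for _ in range(rounds):
        transcript = play_one_game(prompt, SELLER_PROMPT)
        feedback = call_llm(CRITIC_PROMPT, transcript)
        prompt = (base_prompt
                  + "\n\nPrevious negotiation:\n" + "\n".join(transcript)
                  + "\n\nCritic feedback:\n" + feedback)
    return prompt
```

Note that in this sketch only the buyer's prompt accumulates history and feedback, matching the paper's setup of a critic coaching one player; the same loop could be mirrored to coach the seller, and the final deal price extracted from each transcript serves as the evaluation metric.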