

Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback

May 17, 2023
Authors: Yao Fu, Hao Peng, Tushar Khot, Mirella Lapata
cs.AI

Abstract

We study whether multiple large language models (LLMs) can autonomously improve each other in a negotiation game by playing, reflecting, and criticizing. We are interested in this question because if LLMs were able to improve each other, it would imply the possibility of creating strong AI agents with minimal human intervention. We ask two LLMs to negotiate with each other, playing the roles of a buyer and a seller, respectively. They aim to reach a deal, with the buyer targeting a lower price and the seller a higher one. A third language model, playing the critic, provides feedback to a player to improve that player's negotiation strategy. We let the two agents play multiple rounds, using previous negotiation history and AI feedback as in-context demonstrations to iteratively improve the model's negotiation strategy. We use different LLMs (GPT and Claude) for different roles and use the deal price as the evaluation metric. Our experiments reveal several intriguing findings: (1) Only a subset of the language models we consider can self-play and improve the deal price from AI feedback; weaker models either do not understand the game's rules or cannot incorporate AI feedback for further improvement. (2) Models' abilities to learn from feedback differ across roles. For example, it is harder for Claude-instant to improve as the buyer than as the seller. (3) When the game is unrolled over multiple rounds, stronger agents can consistently improve their performance by meaningfully using previous experiences and iterative AI feedback, yet they also run a higher risk of breaking the deal. We hope our work provides an insightful initial exploration of having models autonomously improve each other through game play and AI feedback.
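The loop the abstract describes (play a negotiation round, collect critic feedback, then replay with that history and feedback in context) is compact enough to sketch. Below is a minimal Python sketch under stated assumptions: `llm_reply` is a hypothetical stand-in for whatever chat-completion API you use (here it returns a canned string so the sketch runs end to end), and the prompts, model-name strings, and keyword-based deal check are illustrative choices, not the authors' implementation.

```python
# Minimal sketch of the self-play + AI-feedback loop described in the abstract.
# Assumptions: llm_reply() is a placeholder for a real chat-completion call;
# prompts and the "deal" keyword check are simplified for illustration.

SELLER_SYSTEM = "You are the seller. Negotiate for as high a price as you can."
BUYER_SYSTEM = "You are the buyer. Negotiate for as low a price as you can."
CRITIC_SYSTEM = (
    "You are a negotiation coach. Read the transcript and give the player "
    "concrete advice for reaching a better deal price next time."
)


def llm_reply(model: str, system: str, context: str) -> str:
    """Placeholder: swap in a real chat-completion call for `model` here."""
    return "I propose $10. Deal."  # canned reply so the sketch runs end to end


def play_one_round(buyer: str, seller: str, demos: list[str],
                   max_turns: int = 8) -> str:
    """Run one negotiation, with past transcripts and feedback as in-context demos."""
    context = "\n\n".join(demos)
    transcript: list[str] = []
    for _ in range(max_turns):
        for role, model, system in (("Seller", seller, SELLER_SYSTEM),
                                    ("Buyer", buyer, BUYER_SYSTEM)):
            msg = llm_reply(model, system, context + "\n" + "\n".join(transcript))
            transcript.append(f"{role}: {msg}")
            if "deal" in msg.lower():  # naive check for an accepted offer
                return "\n".join(transcript)
    return "\n".join(transcript)


def self_improve(buyer: str, seller: str, critic: str, rounds: int = 4) -> list[str]:
    """Alternate playing and critiquing; each round's transcript plus the
    critic's feedback becomes an in-context demonstration for the next round."""
    demos: list[str] = []
    for _ in range(rounds):
        transcript = play_one_round(buyer, seller, demos)
        feedback = llm_reply(critic, CRITIC_SYSTEM, transcript)
        demos.append(f"{transcript}\nCoach feedback: {feedback}")
    return demos


if __name__ == "__main__":
    history = self_improve(buyer="gpt-3.5-turbo", seller="claude-instant",
                           critic="gpt-4")
    print(history[-1])  # last round's transcript plus the critic's feedback
```

Note how the demonstration list grows by one (transcript, feedback) pair per round, mirroring the paper's setup in which previous negotiation history and AI feedback are supplied as in-context demonstrations rather than used for any weight update.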