Strategist: Learning Strategic Skills by LLMs via Bi-Level Tree Search
August 20, 2024
Authors: Jonathan Light, Min Cai, Weiqin Chen, Guanzhi Wang, Xiusi Chen, Wei Cheng, Yisong Yue, Ziniu Hu
cs.AI
Abstract
In this paper, we propose Strategist, a new method that utilizes LLMs to
acquire new skills for playing multi-agent games through a self-improvement
process. Our method gathers quality feedback through self-play simulations with
Monte Carlo tree search and LLM-based reflection, which can then be used to
learn high-level strategic skills such as how to evaluate states that guide the
low-level execution. We showcase how our method can be used in both action
planning and dialogue generation in the context of games, achieving good
performance on both tasks. Specifically, we demonstrate that our method can
help train agents with better performance than both traditional reinforcement
learning-based approaches and other LLM-based skill learning approaches in
games including the Game of Pure Strategy (GOPS) and The Resistance: Avalon.
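
The self-improvement idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: a random perturbation stands in for the LLM proposing a revised strategy, and a plain self-play win rate stands in for the feedback that the paper gathers with Monte Carlo tree search and LLM-based reflection. The GOPS rules are simplified (tied prizes are discarded), and every name here (`play_gops`, `make_offset_policy`, `win_rate`, `improve`) is hypothetical.

```python
import random


def play_gops(policy_a, policy_b, n_cards=6, rng=None):
    """Play one simplified GOPS game and return (score_a, score_b)."""
    rng = rng or random.Random()
    prizes = list(range(1, n_cards + 1))
    rng.shuffle(prizes)
    hand_a = set(range(1, n_cards + 1))
    hand_b = set(range(1, n_cards + 1))
    score_a = score_b = 0
    for prize in prizes:
        # Both bids are chosen before either hand is updated (simultaneous play).
        bid_a = policy_a(prize, hand_a, hand_b, rng)
        bid_b = policy_b(prize, hand_b, hand_a, rng)
        hand_a.remove(bid_a)
        hand_b.remove(bid_b)
        if bid_a > bid_b:
            score_a += prize
        elif bid_b > bid_a:
            score_b += prize
        # Tie: the prize is discarded (a simplifying assumption).
    return score_a, score_b


def random_policy(prize, my_hand, opp_hand, rng):
    """Baseline opponent: bid a uniformly random card from the hand."""
    return rng.choice(sorted(my_hand))


def make_offset_policy(offset):
    """Toy 'strategy': bid the held card closest to prize + offset."""
    def policy(prize, my_hand, opp_hand, rng):
        target = prize + offset
        return min(sorted(my_hand), key=lambda card: abs(card - target))
    return policy


def win_rate(policy, opponent, games=200, seed=0):
    """Estimate the win rate by simulation; a draw counts as half a win."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(games):
        a, b = play_gops(policy, opponent, rng=rng)
        total += 1.0 if a > b else 0.5 if a == b else 0.0
    return total / games


def improve(iterations=10, seed=0):
    """Propose a revised strategy, evaluate it by simulation, keep it if better."""
    rng = random.Random(seed)
    best_offset = 0
    best_score = win_rate(make_offset_policy(best_offset), random_policy)
    for _ in range(iterations):
        # Stand-in for an LLM proposing an improved strategy.
        candidate = best_offset + rng.choice([-1, 1])
        score = win_rate(make_offset_policy(candidate), random_policy)
        if score > best_score:
            best_offset, best_score = candidate, score
    return best_offset, best_score
```

Calling `improve()` returns the best offset found and its estimated win rate against the random baseline. Because a candidate is kept only when it scores strictly higher, the returned score never falls below that of the starting strategy.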