

Building Cooperative Embodied Agents Modularly with Large Language Models

July 5, 2023
作者: Hongxin Zhang, Weihua Du, Jiaming Shan, Qinhong Zhou, Yilun Du, Joshua B. Tenenbaum, Tianmin Shu, Chuang Gan
cs.AI

Abstract

Large Language Models (LLMs) have demonstrated impressive planning abilities in single-agent embodied tasks across various domains. However, their capacity for planning and communication in multi-agent cooperation remains unclear, even though these are crucial skills for intelligent embodied agents. In this paper, we present a novel framework that utilizes LLMs for multi-agent cooperation and tests it in various embodied environments. Our framework enables embodied agents to plan, communicate, and cooperate with other embodied agents or humans to accomplish long-horizon tasks efficiently. We demonstrate that recent LLMs, such as GPT-4, can surpass strong planning-based methods and exhibit emergent effective communication using our framework without requiring fine-tuning or few-shot prompting. We also discover that LLM-based agents that communicate in natural language can earn more trust and cooperate more effectively with humans. Our research underscores the potential of LLMs for embodied AI and lays the foundation for future research in multi-agent cooperation. Videos can be found on the project website https://vis-www.cs.umass.edu/Co-LLM-Agents/.
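The abstract does not spell out the framework's internals, but the modular decision loop it implies (observe, exchange natural-language messages, plan with an LLM, act) can be sketched as below. This is a minimal illustrative sketch only: the `CooperativeAgent` class, its prompt format, and the pluggable `llm` callable are assumptions made for illustration, not the paper's actual implementation.

```python
# Minimal illustrative sketch of a modular cooperative embodied agent loop.
# Module boundaries, prompt format, and names here are assumptions, not the
# paper's implementation; any text-in/text-out LLM backend can be plugged in.
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class CooperativeAgent:
    name: str
    llm: Callable[[str], str]                            # e.g. a GPT-4 wrapper
    dialogue: List[str] = field(default_factory=list)    # shared message history

    def step(self, observation: str, goal: str) -> Tuple[str, str]:
        """One decision step: choose an action and optionally a message to teammates."""
        prompt = (
            f"You are agent {self.name} cooperating on the task: {goal}\n"
            f"Current observation: {observation}\n"
            f"Dialogue so far: {self.dialogue}\n"
            "Reply in the form 'ACTION: ... | MESSAGE: ...'"
        )
        reply = self.llm(prompt)
        action_part, _, message = reply.partition("| MESSAGE:")
        action = action_part.replace("ACTION:", "").strip()
        message = message.strip()
        if message:                                       # broadcast to teammates
            self.dialogue.append(f"{self.name}: {message}")
        return action, message


if __name__ == "__main__":
    # Stub LLM for a runnable demo; replace with a real model call in practice.
    stub = lambda prompt: "ACTION: go to kitchen | MESSAGE: I'll search the kitchen."
    agent = CooperativeAgent(name="Alice", llm=stub)
    print(agent.step("You are in the living room.", "find all plates"))
```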