

Building Cooperative Embodied Agents Modularly with Large Language Models

July 5, 2023
Authors: Hongxin Zhang, Weihua Du, Jiaming Shan, Qinhong Zhou, Yilun Du, Joshua B. Tenenbaum, Tianmin Shu, Chuang Gan
cs.AI

Abstract

Large Language Models (LLMs) have demonstrated impressive planning abilities in single-agent embodied tasks across various domains. However, their capacity for planning and communication in multi-agent cooperation remains unclear, even though these are crucial skills for intelligent embodied agents. In this paper, we present a novel framework that utilizes LLMs for multi-agent cooperation and tests it in various embodied environments. Our framework enables embodied agents to plan, communicate, and cooperate with other embodied agents or humans to accomplish long-horizon tasks efficiently. We demonstrate that recent LLMs, such as GPT-4, can surpass strong planning-based methods and exhibit emergent effective communication using our framework without requiring fine-tuning or few-shot prompting. We also discover that LLM-based agents that communicate in natural language can earn more trust and cooperate more effectively with humans. Our research underscores the potential of LLMs for embodied AI and lays the foundation for future research in multi-agent cooperation. Videos can be found on the project website https://vis-www.cs.umass.edu/Co-LLM-Agents/.
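The cooperation pattern the abstract describes — agents that use an LLM to both plan their next action and compose natural-language messages to teammates — can be illustrated with a minimal conceptual sketch. This is not the paper's actual framework; the agent structure, the `llm_propose` stub (which stands in for a real call to a model such as GPT-4), and the household-search scenario are all illustrative assumptions.

```python
# Minimal sketch of LLM-driven multi-agent cooperation: each agent prompts an
# LLM with its own context plus messages received from teammates, and gets back
# both a plan (what to do) and a message (what to tell the other agent).

def llm_propose(prompt: str) -> dict:
    """Stub standing in for an LLM call (e.g., GPT-4 in the paper's setting).

    Returns a toy plan/message pair based on the agent's hint in the prompt.
    """
    if "Hint: kitchen" in prompt:
        return {"plan": "search kitchen", "message": "I'll take the kitchen."}
    return {"plan": "search living room", "message": "I'll cover the living room."}


class Agent:
    """A cooperative agent that plans and communicates via an LLM."""

    def __init__(self, name: str, hint: str):
        self.name = name
        self.hint = hint      # private observation only this agent has
        self.inbox = []       # natural-language messages from teammates

    def step(self):
        # Build a prompt from the agent's own state and received messages,
        # then let the LLM decide the next plan and outgoing message.
        prompt = (f"You are {self.name}. Hint: {self.hint}. "
                  f"Messages from teammates: {self.inbox}")
        out = llm_propose(prompt)
        self.inbox.clear()
        return out["plan"], out["message"]


# Two agents divide the work by exchanging natural-language messages.
alice = Agent("Alice", "kitchen")
bob = Agent("Bob", "living room")

plan_a, msg_a = alice.step()
bob.inbox.append(msg_a)       # communication channel between agents
plan_b, _ = bob.step()

print(plan_a)  # search kitchen
print(plan_b)  # search living room
```

The key design point mirrored here is that coordination emerges from the messages themselves: Bob's prompt includes Alice's message, so an actual LLM could avoid duplicating her plan without any fine-tuning or hand-coded task allocation.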