BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents
August 11, 2023
Authors: Zhiwei Liu, Weiran Yao, Jianguo Zhang, Le Xue, Shelby Heinecke, Rithesh Murthy, Yihao Feng, Zeyuan Chen, Juan Carlos Niebles, Devansh Arpit, Ran Xu, Phil Mui, Huan Wang, Caiming Xiong, Silvio Savarese
cs.AI
Abstract
The massive successes of large language models (LLMs) encourage the emerging
exploration of LLM-augmented Autonomous Agents (LAAs). An LAA is able to
generate actions with its core LLM and interact with environments, which
facilitates the ability to resolve complex tasks by conditioning on past
interactions such as observations and actions. Because research on LAAs
is still very recent, few such studies are available. Therefore, we provide
a comprehensive comparison of LAAs in terms of both agent architectures and LLM
backbones. Additionally, we propose a new strategy, BOLAA, that orchestrates
multiple LAAs so that each labor LAA focuses on one type of action, while
a controller manages the communication among the agents. We conduct
simulations on both decision-making and multi-step reasoning environments,
which comprehensively demonstrate the capabilities of LAAs. Our performance results
provide quantitative suggestions for designing LAA architectures and the
optimal choice of LLMs, as well as the compatibility of both. We release our
implementation code of LAAs to the public at
https://github.com/salesforce/BOLAA.
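
The orchestration idea in the abstract (a controller that routes each step to a labor agent specialized in one action type) can be illustrated with a minimal sketch. This is not the authors' implementation and does not use the API of the released repository: the names `LaborAgent`, `Controller`, `select`, and `Policy` are hypothetical, the LLM is replaced by a plain Python function, and the routing rule and the "search"/"click" action types are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for an LLM call: maps a prompt string to an action argument.
Policy = Callable[[str], str]


@dataclass
class LaborAgent:
    """An agent that only produces one type of action (assumed: 'search' or 'click')."""
    action_type: str
    policy: Policy

    def act(self, observation: str, history: List[Tuple[str, str]]) -> str:
        # Condition on past (observation, action) pairs plus the current observation.
        past = " | ".join(f"{o} -> {a}" for o, a in history)
        prompt = f"history: {past}\nobservation: {observation}"
        return f"{self.action_type}[{self.policy(prompt)}]"


@dataclass
class Controller:
    """Routes each observation to one labor agent and records the interaction."""
    agents: Dict[str, LaborAgent]
    select: Callable[[str], str]  # hypothetical routing rule: observation -> action type
    history: List[Tuple[str, str]] = field(default_factory=list)

    def step(self, observation: str) -> str:
        agent = self.agents[self.select(observation)]
        action = agent.act(observation, self.history)
        self.history.append((observation, action))
        return action


# Usage with stub policies in place of real LLM calls.
controller = Controller(
    agents={
        "search": LaborAgent("search", lambda prompt: "red shoes"),
        "click": LaborAgent("click", lambda prompt: "buy now"),
    },
    select=lambda obs: "search" if "search bar" in obs else "click",
)
first = controller.step("page with search bar")
second = controller.step("product listing")
print(first, second)
```

In this sketch the controller owns the shared interaction history, so every labor agent sees the same past observations and actions regardless of which agent produced them. A real system would presumably replace the stub policies with LLM calls and the keyword-based `select` rule with a learned or prompted selection step.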