BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents

Abstract

The success of large language models (LLMs) has encouraged the exploration of LLM-augmented Autonomous Agents (LAAs). An LAA generates actions with its core LLM and interacts with environments, which lets it resolve complex tasks by conditioning on past interactions such as observations and actions. Because the study of LAAs is still very recent, few explorations are available. We therefore provide a comprehensive comparison of LAAs in terms of both agent architectures and LLM backbones. Additionally, we propose BOLAA, a new strategy that orchestrates multiple LAAs such that each labor LAA focuses on one type of action and a controller manages the communication among the agents. We conduct simulations on both decision-making and multi-step reasoning environments, which comprehensively justify the capacity of LAAs. Our performance results provide quantitative suggestions for designing LAA architectures and choosing LLMs, as well as for the compatibility of the two. We release our implementation code of LAAs to the public at https://github.com/salesforce/BOLAA.
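
The orchestration idea in the abstract, where a controller routes each step to a labor agent that handles one type of action, can be illustrated with a minimal Python sketch. It is not taken from the released repository. The class names `LaborAgent` and `Controller`, the `search` and `click` action types, the keyword-based routing rule, and the lambda policies standing in for LLM calls are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for an LLM call: (observation, history) -> action argument.
Policy = Callable[[str, List[Tuple[str, str]]], str]


@dataclass
class LaborAgent:
    """An agent restricted to a single action type (illustrative, not the paper's API)."""
    action_type: str
    policy: Policy

    def act(self, observation: str, history: List[Tuple[str, str]]) -> str:
        # Wrap the policy output in this agent's one permitted action type.
        return f"{self.action_type}[{self.policy(observation, history)}]"


@dataclass
class Controller:
    """Selects one labor agent per step and keeps the shared interaction history."""
    agents: Dict[str, LaborAgent]
    select: Callable[[str], str]  # assumed routing rule: observation -> action type
    history: List[Tuple[str, str]] = field(default_factory=list)

    def step(self, observation: str) -> str:
        agent = self.agents[self.select(observation)]
        action = agent.act(observation, self.history)
        # Record the (observation, action) pair so later steps can condition on it.
        self.history.append((observation, action))
        return action


def build_demo_controller() -> Controller:
    # Fixed-output policies replace real LLM calls in this sketch.
    search = LaborAgent("search", lambda obs, hist: "red running shoes")
    click = LaborAgent("click", lambda obs, hist: "first result")
    route = lambda obs: "click" if "results" in obs else "search"
    return Controller({"search": search, "click": click}, route)
```

Calling `build_demo_controller().step("home page")` routes to the search agent, while an observation containing "results" routes to the click agent. In a real system the routing rule and each policy would presumably be LLM-backed rather than hard-coded.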
