BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents

Abstract

The massive success of large language models (LLMs) has encouraged the emerging exploration of LLM-augmented Autonomous Agents (LAAs). An LAA generates actions with its core LLM and interacts with environments, which lets it resolve complex tasks by conditioning on past interactions such as observations and actions. Because the investigation of LAAs is still very recent, few explorations are available. We therefore provide a comprehensive comparison of LAAs in terms of both agent architectures and LLM backbones. Additionally, we propose BOLAA, a new strategy for orchestrating multiple LAAs in which each labor LAA focuses on one type of action and a controller manages the communication among the agents. We conduct simulations in both decision-making and multi-step reasoning environments, which comprehensively justify the capacity of LAAs. Our performance results provide quantitative suggestions for designing LAA architectures, for choosing LLMs, and for the compatibility of the two. We release our implementation code of LAAs to the public at https://github.com/salesforce/BOLAA.
