BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents

Abstract

The success of large language models (LLMs) has encouraged exploration of LLM-augmented Autonomous Agents (LAAs). An LAA generates actions with its core LLM and interacts with environments, which lets it resolve complex tasks by conditioning on past interactions such as observations and actions. Because the investigation of LAAs is still very recent, few explorations are available. We therefore provide a comprehensive comparison of LAAs in terms of both agent architectures and LLM backbones. We also propose BOLAA, a new strategy for orchestrating multiple LAAs in which each labor LAA focuses on one type of action and a controller manages the communication among the agents. We conduct simulations on both decision-making and multi-step reasoning environments, which comprehensively justify the capacity of LAAs. Our performance results provide quantitative suggestions for designing LAA architectures, for choosing LLMs, and for the compatibility of the two. We release our implementation code of LAAs to the public at https://github.com/salesforce/BOLAA.
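The orchestration idea described in the abstract, a controller routing work to labor agents that each handle a single action type, can be illustrated with a minimal sketch. This is not the released BOLAA code: the class names `LaborAgent` and `Controller`, the `search` and `click` action types, and the keyword-based routing rule are all assumptions made for illustration, and plain Python functions stand in for the LLM calls so the example runs offline.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A policy maps (task, history) to an action argument. In a real LAA this
# would be an LLM call; here it is a plain function (assumption for the demo).
Policy = Callable[[str, List[Tuple[str, str]]], str]


@dataclass
class LaborAgent:
    """Hypothetical labor agent that only emits one type of action."""
    action_type: str
    policy: Policy

    def act(self, task: str, history: List[Tuple[str, str]]) -> str:
        return f"{self.action_type}[{self.policy(task, history)}]"


@dataclass
class Controller:
    """Hypothetical controller: picks a labor agent and relays its action."""
    agents: Dict[str, LaborAgent]
    history: List[Tuple[str, str]] = field(default_factory=list)

    def select(self, observation: str) -> LaborAgent:
        # Assumed toy routing rule: search until results appear, then click.
        key = "click" if "results" in observation else "search"
        return self.agents[key]

    def step(self, task: str, observation: str) -> str:
        agent = self.select(observation)
        action = agent.act(task, self.history)
        # Past (observation, action) pairs condition later decisions.
        self.history.append((observation, action))
        return action


def build_controller() -> Controller:
    # Two illustrative labor agents, one per action type.
    search = LaborAgent("search", lambda task, hist: task)
    click = LaborAgent("click", lambda task, hist: f"item {len(hist)}")
    return Controller({"search": search, "click": click})


if __name__ == "__main__":
    ctrl = build_controller()
    print(ctrl.step("red shoes", "home page"))     # search[red shoes]
    print(ctrl.step("red shoes", "results page"))  # click[item 1]
```

In this sketch the controller owns the shared interaction history and each labor agent sees it only when selected, which is one simple way to keep communication centralized; other designs are equally plausible.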
