TPTU: Task Planning and Tool Usage of Large Language Model-based AI Agents

Abstract

With recent advancements in natural language processing, Large Language Models (LLMs) have emerged as powerful tools for various real-world applications. Despite their prowess, the intrinsic generative abilities of LLMs may prove insufficient for complex tasks that require a combination of task planning and the use of external tools. In this paper, we first propose a structured framework tailored for LLM-based AI agents and discuss the crucial capabilities necessary for tackling intricate problems. Within this framework, we design two distinct types of agents (i.e., a one-step agent and a sequential agent) to execute the inference process. Subsequently, we instantiate the framework using various LLMs and evaluate their Task Planning and Tool Usage (TPTU) abilities on typical tasks. By highlighting key findings and challenges, we aim to provide a helpful resource for researchers and practitioners seeking to leverage the power of LLMs in their AI applications. Our study emphasizes the substantial potential of these models while also identifying areas that need further investigation and improvement.
