TPTU: Task Planning and Tool Usage of Large Language Model-based AI Agents

Abstract

With recent advances in natural language processing, Large Language Models (LLMs) have emerged as powerful tools for a variety of real-world applications. Despite their capabilities, the intrinsic generative abilities of LLMs may prove insufficient for complex tasks that require a combination of task planning and the use of external tools. In this paper, we first propose a structured framework tailored to LLM-based AI agents and discuss the capabilities necessary for tackling intricate problems. Within this framework, we design two distinct types of agents (i.e., a one-step agent and a sequential agent) to execute the inference process. We then instantiate the framework with various LLMs and evaluate their Task Planning and Tool Usage (TPTU) abilities on typical tasks. By highlighting key findings and challenges, we aim to provide a helpful resource for researchers and practitioners seeking to leverage the power of LLMs in their AI applications. Our study emphasizes the substantial potential of these models while also identifying areas that need further investigation and improvement.
