Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents

Abstract

Public research results on large-scale supervised fine-tuning (SFT) of AI agents remain relatively rare, because collecting agent training data presents unique challenges. In this work, we argue that the bottleneck is not a lack of underlying data sources, but that a large variety of data is fragmented across heterogeneous formats, tools, and interfaces. To address this, we introduce the Agent Data Protocol (ADP), a lightweight representation language that serves as an "interlingua" between agent datasets in diverse formats and unified agent training pipelines downstream. ADP is expressive enough to capture a large variety of tasks, including API/tool use, browsing, coding, software engineering, and general agentic workflows, while remaining simple to parse and train on without per-dataset engineering. In experiments, we unified a broad collection of 13 existing agent training datasets into ADP format and converted the standardized ADP data into training-ready formats for multiple agent frameworks. We performed SFT on these data and demonstrated an average performance gain of ~20% over the corresponding base models, delivering state-of-the-art or near-SOTA performance on standard coding, browsing, tool use, and research benchmarks without domain-specific tuning. All code and data are released publicly, in the hope that ADP can help lower the barrier to standardized, scalable, and reproducible agent training.
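The "interlingua" idea can be illustrated with a minimal sketch: each source dataset gets one converter into a shared trajectory representation, and each agent framework gets one converter out of it, so adding a dataset or a framework costs one converter rather than one per pairing. Everything below is an assumption made for illustration. The `Step` and `Trajectory` classes, the two raw input layouts, and the function names are hypothetical and are not the actual ADP schema or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Step:
    """One turn of an agent trajectory (hypothetical, not the real ADP schema)."""
    role: str     # "action" (agent output) or "observation" (environment output)
    kind: str     # e.g. "api", "message", "text"
    content: str


@dataclass
class Trajectory:
    """A unified trajectory: an ordered list of actions and observations."""
    id: str
    steps: List[Step] = field(default_factory=list)


def from_tool_log(raw: dict) -> Trajectory:
    """Convert a hypothetical tool-call log into the unified form.

    Assumed raw layout: {"id": str, "calls": [{"fn", "args", "result"}, ...]}
    """
    steps: List[Step] = []
    for call in raw["calls"]:
        steps.append(Step("action", "api", f'{call["fn"]}({call["args"]})'))
        steps.append(Step("observation", "text", str(call["result"])))
    return Trajectory(raw["id"], steps)


def from_chat_log(raw: dict) -> Trajectory:
    """Convert a hypothetical chat-style log into the unified form.

    Assumed raw layout: {"id": str, "turns": [{"speaker", "text"}, ...]}
    where speaker is "assistant" or anything else (treated as environment).
    """
    steps: List[Step] = []
    for turn in raw["turns"]:
        if turn["speaker"] == "assistant":
            steps.append(Step("action", "message", turn["text"]))
        else:
            steps.append(Step("observation", "text", turn["text"]))
    return Trajectory(raw["id"], steps)


def to_sft_messages(traj: Trajectory) -> List[Dict[str, str]]:
    """Render a unified trajectory as chat messages for SFT.

    Actions become assistant turns; observations become user turns.
    This mapping is one possible choice, not a framework requirement.
    """
    role_map = {"action": "assistant", "observation": "user"}
    return [{"role": role_map[s.role], "content": s.content} for s in traj.steps]


if __name__ == "__main__":
    tool_raw = {"id": "t1", "calls": [{"fn": "search", "args": "q='adp'", "result": "3 hits"}]}
    chat_raw = {"id": "c1", "turns": [{"speaker": "env", "text": "Task: say hi"},
                                      {"speaker": "assistant", "text": "hi"}]}
    # Two different source formats flow through the same downstream converter.
    for traj in (from_tool_log(tool_raw), from_chat_log(chat_raw)):
        print(traj.id, to_sft_messages(traj))
```

In this sketch, supporting a third dataset means writing one more `from_*` function, and supporting a second framework means writing one more `to_*` function; no converter needs to know about any format other than the shared one.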

