TaskCraft: Automated Generation of Agentic Tasks

Abstract

Agentic tasks, which require multi-step problem solving with autonomy, tool use, and adaptive reasoning, are becoming increasingly central to the advancement of NLP and AI. However, existing instruction data lacks tool interaction, and current agentic benchmarks rely on costly human annotation, which limits their scalability. We introduce TaskCraft, an automated workflow for generating difficulty-scalable, multi-tool, and verifiable agentic tasks together with their execution trajectories. TaskCraft expands atomic tasks using depth-based and width-based extensions to create structurally and hierarchically complex challenges. Empirical results show that these tasks improve prompt optimization in the generation workflow and enhance supervised fine-tuning of agentic foundation models. We present a large-scale synthetic dataset of approximately 36,000 tasks of varying difficulty to support future research on agent tuning and evaluation.
