TaskCraft: Automated Generation of Agentic Tasks
June 11, 2025
Authors: Dingfeng Shi, Jingyi Cao, Qianben Chen, Weichen Sun, Weizhen Li, Hongxuan Lu, Fangchen Dong, Tianrui Qin, King Zhu, Minghao Yang, Jian Yang, Ge Zhang, Jiaheng Liu, Changwang Zhang, Jun Wang, Yuchen Eleanor Jiang, Wangchunshu Zhou
cs.AI
Abstract
Agentic tasks, which require multi-step problem solving with autonomy, tool
use, and adaptive reasoning, are becoming increasingly central to the
advancement of NLP and AI. However, existing instruction data lacks tool
interaction, and current agentic benchmarks rely on costly human annotation,
limiting their scalability. We introduce TaskCraft, an automated
workflow for generating difficulty-scalable, multi-tool, and verifiable agentic
tasks with execution trajectories. TaskCraft expands atomic tasks using
depth-based and width-based extensions to create structurally and
hierarchically complex challenges. Empirical results show that these tasks
improve prompt optimization in the generation workflow and enhance supervised
fine-tuning of agentic foundation models. We present a large-scale synthetic
dataset of approximately 36,000 tasks with varying difficulty to support future
research on agent tuning and evaluation.
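
The abstract names depth-based and width-based extension without describing how they work. The Python sketch below is one possible reading, not the authors' implementation: it assumes that a depth-based extension chains tasks so that one task's answer must be found before the next can be solved, and that a width-based extension merges independent tasks into one composite task. The `Task` class, the `steps` counter, and the functions `extend_depth` and `extend_width` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    """Hypothetical container for a verifiable task (not from the paper)."""
    question: str
    answer: str
    steps: int = 1  # assumed proxy for difficulty: number of sub-problems


def extend_depth(outer: Task, inner: Task) -> Task:
    """Assumed depth-based extension: hide an entity in `outer` behind `inner`.

    The literal answer of `inner` is replaced in the outer question by a
    reference to the inner question, so both must be solved in sequence.
    """
    if inner.answer not in outer.question:
        raise ValueError("inner answer must appear in the outer question")
    question = outer.question.replace(
        inner.answer, f"[the answer to: {inner.question}]", 1
    )
    return Task(question, outer.answer, outer.steps + inner.steps)


def extend_width(tasks: List[Task]) -> Task:
    """Assumed width-based extension: merge independent tasks into one."""
    question = " ".join(f"({i}) {t.question}" for i, t in enumerate(tasks, 1))
    answer = "; ".join(t.answer for t in tasks)
    return Task(question, answer, sum(t.steps for t in tasks))


# Two toy atomic tasks, made up for this example.
city = Task("Which city hosted the 2012 Summer Olympics?", "London")
river = Task("Which river flows through London?", "Thames")

deep = extend_depth(river, city)   # two sequential sub-problems
wide = extend_width([city, river])  # two parallel sub-problems
print(deep.question)
print(wide.question)
```

In this reading the final answer of a depth-extended task stays the same as the outer task's answer, which keeps the result checkable against a known string; whether TaskCraft verifies tasks this way is not stated in the abstract.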