AgentFrontier: Expanding the Capability Frontier of LLM Agents with ZPD-Guided Data Synthesis
(Note: ZPD abbreviates "Zone of Proximal Development", a concept from educational psychology proposed by Vygotsky. It refers to the distance between a learner's current actual level of development and their potential level of development.)
October 28, 2025
Authors: Xuanzhong Chen, Zile Qiao, Guoxin Chen, Liangcai Su, Zhen Zhang, Xinyu Wang, Pengjun Xie, Fei Huang, Jingren Zhou, Yong Jiang
cs.AI
Abstract
Training large language model agents on tasks at the frontier of their
capabilities is key to unlocking advanced reasoning. We introduce a data
synthesis approach inspired by the educational theory of the Zone of Proximal
Development (ZPD), which defines this frontier as tasks an LLM cannot solve
alone but can master with guidance. To operationalize this, we present the
AgentFrontier Engine, an automated pipeline that synthesizes high-quality,
multidisciplinary data situated precisely within the LLM's ZPD. This engine
supports both continued pre-training with knowledge-intensive data and targeted
post-training on complex reasoning tasks. From the same framework, we derive
the ZPD Exam, a dynamic and automated benchmark designed to evaluate agent
capabilities on these frontier tasks. We train the AgentFrontier-30B-A3B model
on our synthesized data; it achieves state-of-the-art results on demanding
benchmarks such as Humanity's Last Exam, even surpassing some leading proprietary
agents. Our work demonstrates that a ZPD-guided approach to data synthesis
offers a scalable and effective path toward building more capable LLM agents.
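
The ZPD criterion stated above (tasks a model cannot solve alone but can master with guidance) can be illustrated with a small filter. This is only a sketch of the definition, not the AgentFrontier Engine itself: the abstract does not say how the engine tests solvability, so `select_zpd_tasks`, the two solver callbacks, and the toy task strings are all hypothetical names chosen for illustration.

```python
from typing import Callable, Iterable, List

# Hypothetical interface: a solver returns True if it answers the task correctly.
Solver = Callable[[str], bool]


def select_zpd_tasks(
    tasks: Iterable[str],
    solve_alone: Solver,
    solve_with_guidance: Solver,
) -> List[str]:
    """Keep tasks that fail unaided but succeed when guidance is available."""
    return [
        task
        for task in tasks
        if not solve_alone(task) and solve_with_guidance(task)
    ]


# Toy stand-ins for a base model and a guided (e.g. tool-assisted) model.
# These sets are invented for the example and carry no empirical meaning.
SOLVED_ALONE = {"add two numbers"}
SOLVED_WITH_GUIDANCE = {"add two numbers", "multi-hop literature question"}


def toy_alone(task: str) -> bool:
    return task in SOLVED_ALONE


def toy_guided(task: str) -> bool:
    return task in SOLVED_WITH_GUIDANCE


tasks = [
    "add two numbers",                # too easy: solved alone
    "multi-hop literature question",  # in the ZPD: solved only with guidance
    "open research problem",          # too hard: unsolved even with guidance
]
zpd = select_zpd_tasks(tasks, toy_alone, toy_guided)
print(zpd)
```

Under this reading, tasks solved unaided are too easy to teach anything new, and tasks that remain unsolved even with guidance lie beyond the frontier. Only the middle band is kept.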