WebResearcher: Unleashing Unbounded Reasoning Capability in Long-Horizon Agents

Abstract

Recent advances in deep-research systems have demonstrated the potential for AI agents to autonomously discover and synthesize knowledge from external sources. In this paper, we introduce WebResearcher, a novel framework for building such agents through two key components: (1) WebResearcher, an iterative deep-research paradigm that reformulates deep research as a Markov Decision Process, where agents periodically consolidate findings into evolving reports while maintaining focused workspaces, overcoming the context suffocation and noise contamination that plague existing mono-contextual approaches; and (2) WebFrontier, a scalable data synthesis engine that generates high-quality training data through tool-augmented complexity escalation, enabling systematic creation of research tasks that bridge the gap between passive knowledge recall and active knowledge construction. Notably, we find that the training data from our paradigm significantly enhances tool-use capabilities even for traditional mono-contextual methods. Furthermore, our paradigm naturally scales through parallel thinking, enabling concurrent multi-agent exploration for more comprehensive conclusions. Extensive experiments across 6 challenging benchmarks demonstrate that WebResearcher achieves state-of-the-art performance, even surpassing frontier proprietary systems.
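
The iterative paradigm described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `State`, `run_research`, `toy_policy`, and `toy_tool` are hypothetical, and the policy here is a hand-written stand-in for what would presumably be a language model. The sketch only shows the structural idea stated in the abstract: each round works from a compact state (question, evolving report, latest observation) rather than from the full interaction history, which is what gives the process its Markov character.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class State:
    """Compact state carried between rounds (hypothetical structure)."""
    question: str
    report: str            # evolving report consolidated so far
    last_observation: str  # only the most recent tool result is kept


# Assumed interfaces: a policy returns (updated_report, action, argument),
# and a tool maps a query string to an observation string.
Policy = Callable[[State], Tuple[str, str, str]]
Tool = Callable[[str], str]


def run_research(question: str, policy: Policy, tool: Tool,
                 max_rounds: int = 8) -> Tuple[str, List[int]]:
    """Run the loop; return the answer and the workspace size per round."""
    state = State(question, report="", last_observation="")
    workspace_sizes: List[int] = []
    for _ in range(max_rounds):
        # The workspace is rebuilt from the compact state each round,
        # so older raw observations never accumulate in it.
        workspace_sizes.append(
            len(state.question) + len(state.report) + len(state.last_observation)
        )
        report, action, argument = policy(state)
        if action == "answer":
            return argument, workspace_sizes
        observation = tool(argument)
        state = State(question, report, observation)
    # Round budget exhausted: fall back to the report built so far.
    return state.report, workspace_sizes


def toy_tool(query: str) -> str:
    """Stand-in for a search tool."""
    return f"fact-{query}"


def toy_policy(state: State) -> Tuple[str, str, str]:
    """Fold the latest observation into the report; answer after three notes."""
    notes = [n for n in state.report.split(";") if n]
    if state.last_observation:
        notes.append(state.last_observation)
    report = ";".join(notes)
    if len(notes) >= 3:
        return report, "answer", report
    return report, "search", f"q{len(notes) + 1}"


if __name__ == "__main__":
    answer, sizes = run_research("demo", toy_policy, toy_tool)
    print(answer)
    print(sizes)
```

In this toy version the report simply concatenates observations, so it still grows with each round. A real agent would presumably summarize when consolidating, which is what would keep the workspace focused over long horizons.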
