WebResearcher: Unleashing unbounded reasoning capability in Long-Horizon Agents
September 16, 2025
Authors: Zile Qiao, Guoxin Chen, Xuanzhong Chen, Donglei Yu, Wenbiao Yin, Xinyu Wang, Zhen Zhang, Baixuan Li, Huifeng Yin, Kuan Li, Rui Min, Minpeng Liao, Yong Jiang, Pengjun Xie, Fei Huang, Jingren Zhou
cs.AI
Abstract
Recent advances in deep-research systems have demonstrated the potential for
AI agents to autonomously discover and synthesize knowledge from external
sources. In this paper, we introduce WebResearcher, a novel framework for
building such agents through two key components: (1) WebResearcher, an
iterative deep-research paradigm that reformulates deep research as a Markov
Decision Process, where agents periodically consolidate findings into evolving
reports while maintaining focused workspaces, overcoming the context
suffocation and noise contamination that plague existing mono-contextual
approaches; and (2) WebFrontier, a scalable data synthesis engine that
generates high-quality training data through tool-augmented complexity
escalation, enabling systematic creation of research tasks that bridge the gap
between passive knowledge recall and active knowledge construction. Notably, we
find that the training data from our paradigm significantly enhances tool-use
capabilities even for traditional mono-contextual methods. Furthermore, our
paradigm naturally scales through parallel thinking, enabling concurrent
multi-agent exploration for more comprehensive conclusions. Extensive
experiments across 6 challenging benchmarks demonstrate that WebResearcher
achieves state-of-the-art performance, even surpassing frontier proprietary
systems.
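
The iterative paradigm described above can be illustrated with a minimal sketch. This is not the authors' implementation: the names `State`, `run_iterative_research`, `toy_policy`, and `toy_tool` are hypothetical, and the policy and tool are stand-ins for a language model and a web tool. The sketch only shows the idea that each round sees the question, the evolving report, and the latest observation, rather than the full interaction history.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class State:
    """Hypothetical MDP state: the only context carried between rounds."""
    question: str
    report: str            # evolving report that consolidates findings so far
    last_observation: str  # most recent tool result only, not the full history


# Assumed interfaces: a policy returns (updated_report, next_action, done);
# a tool maps an action string to an observation string.
Policy = Callable[[State], Tuple[str, str, bool]]
Tool = Callable[[str], str]


def run_iterative_research(question: str, policy: Policy, tool: Tool,
                           max_rounds: int = 8) -> str:
    """Run rounds that rebuild a small workspace from the state each time."""
    state = State(question, "", "")
    for _ in range(max_rounds):
        report, action, done = policy(state)
        if done:
            return report
        observation = tool(action)
        # Earlier observations are dropped; only the report persists.
        state = State(question, report, observation)
    return state.report


def toy_policy(state: State) -> Tuple[str, str, bool]:
    """Stand-in policy: fold the last observation into the report."""
    report = (state.report + " " + state.last_observation).strip()
    n_facts = len(report.split())
    return report, f"lookup:{n_facts}", n_facts >= 3


def toy_tool(action: str) -> str:
    """Stand-in tool: return a fake fact named after the action index."""
    return "fact-" + action.split(":")[1]


if __name__ == "__main__":
    print(run_iterative_research("example question", toy_policy, toy_tool))
```

Because the workspace is rebuilt from `State` in every round, its size depends on the report length rather than on the number of rounds, which is the property the abstract contrasts with mono-contextual approaches.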