WebWeaver: Structuring Web-Scale Evidence with Dynamic Outlines for Open-Ended Deep Research

September 16, 2025
Authors: Zijian Li, Xin Guan, Bo Zhang, Shen Huang, Houquan Zhou, Shaopeng Lai, Ming Yan, Yong Jiang, Pengjun Xie, Fei Huang, Jun Zhang, Jingren Zhou
cs.AI

Abstract

This paper tackles open-ended deep research (OEDR), a complex challenge where AI agents must synthesize vast web-scale information into insightful reports. Current approaches suffer from two limitations: static research pipelines that decouple planning from evidence acquisition, and one-shot generation paradigms that are prone to long-context failures such as "loss in the middle" and hallucinations. To address these challenges, we introduce WebWeaver, a novel dual-agent framework that emulates the human research process. The planner operates in a dynamic cycle, iteratively interleaving evidence acquisition with outline optimization to produce a comprehensive, source-grounded outline that links to a memory bank of evidence. The writer then executes a hierarchical retrieval and writing process, composing the report section by section. By retrieving from the memory bank only the evidence needed for each section, the writer effectively mitigates long-context issues. Our framework establishes a new state of the art across major OEDR benchmarks, including DeepResearch Bench, DeepConsult, and DeepResearchGym. These results validate our human-centric, iterative methodology, demonstrating that adaptive planning and focused synthesis are crucial for producing high-quality, reliable, and well-structured reports.
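The planner/writer split described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `MemoryBank`, `planner`, `writer`, and `TOY_WEB` are hypothetical names, the "web search" is a hard-coded dictionary of made-up snippets, and the query list is fixed up front, whereas a real planner would presumably choose its next query from gaps in the current outline and a real writer would call a language model.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryBank:
    """Evidence store keyed by citation id (hypothetical structure)."""
    items: dict = field(default_factory=dict)

    def add(self, text):
        # Assign the next citation id and store the snippet under it.
        cid = len(self.items) + 1
        self.items[cid] = text
        return cid

    def fetch(self, ids):
        # Return (citation id, snippet) pairs for the requested ids only.
        return [(i, self.items[i]) for i in ids]


# Toy stand-in for web search (assumption): query -> (topic, snippet) pairs.
# The snippets are invented placeholders, not real evidence.
TOY_WEB = {
    "solar power": [("Cost", "Panel prices fell sharply."),
                    ("Storage", "Batteries smooth out supply.")],
    "solar storage": [("Storage", "Grid batteries are scaling up.")],
}


def planner(queries, bank):
    """Interleave evidence acquisition with outline updates."""
    outline = {}  # section title -> list of citation ids
    for query in queries:
        for topic, snippet in TOY_WEB.get(query, []):  # acquire evidence
            cid = bank.add(snippet)                     # store it in the bank
            outline.setdefault(topic, []).append(cid)   # refine the outline
    return outline


def writer(outline, bank):
    """Write section by section, retrieving only each section's evidence."""
    parts = []
    for title, ids in outline.items():
        evidence = bank.fetch(ids)  # targeted retrieval, not the whole bank
        body = " ".join(f"{text} [{cid}]" for cid, text in evidence)
        parts.append(f"{title}\n{body}")
    return "\n\n".join(parts)


bank = MemoryBank()
outline = planner(["solar power", "solar storage"], bank)
report = writer(outline, bank)
print(report)
```

The point of the sketch is the data flow: the outline carries citation ids rather than evidence text, so each writing step sees only the snippets its section cites instead of the full memory bank.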