WebWeaver: Structuring Web-Scale Evidence with Dynamic Outlines for Open-Ended Deep Research

September 16, 2025
Authors: Zijian Li, Xin Guan, Bo Zhang, Shen Huang, Houquan Zhou, Shaopeng Lai, Ming Yan, Yong Jiang, Pengjun Xie, Fei Huang, Jun Zhang, Jingren Zhou
cs.AI

Abstract

This paper tackles open-ended deep research (OEDR), a complex challenge in which AI agents must synthesize vast web-scale information into insightful reports. Current approaches suffer from two limitations: static research pipelines that decouple planning from evidence acquisition, and one-shot generation paradigms that are prone to long-context failures such as "lost in the middle" and hallucination. To address these challenges, we introduce WebWeaver, a novel dual-agent framework that emulates the human research process. The planner operates in a dynamic cycle, iteratively interleaving evidence acquisition with outline optimization to produce a comprehensive, source-grounded outline linked to a memory bank of evidence. The writer then executes a hierarchical retrieval and writing process, composing the report section by section. By retrieving from the memory bank only the evidence needed for each part, the writer effectively mitigates long-context issues. Our framework establishes a new state of the art across major OEDR benchmarks, including DeepResearch Bench, DeepConsult, and DeepResearchGym. These results validate our human-centric, iterative methodology, demonstrating that adaptive planning and focused synthesis are crucial for producing high-quality, reliable, and well-structured reports.
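The planner/writer split described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: every name here (`MemoryBank`, `Section`, `plan`, `write_report`) is hypothetical, the `search` and `compose` callables stand in for web search and LLM calls, and the planning loop is simplified to one search round per topic. It shows only the core idea that the outline stores evidence ids and the writer loads just the evidence cited by the section it is currently writing.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Section:
    """One outline entry: a title plus ids of evidence in the memory bank."""
    title: str
    evidence_ids: List[int] = field(default_factory=list)


class MemoryBank:
    """Hypothetical evidence store; snippets are keyed by integer id."""

    def __init__(self) -> None:
        self._items: Dict[int, str] = {}

    def add(self, snippet: str) -> int:
        evidence_id = len(self._items)
        self._items[evidence_id] = snippet
        return evidence_id

    def get(self, ids: List[int]) -> List[str]:
        return [self._items[i] for i in ids]


def plan(
    topics: List[str],
    search: Callable[[str], List[str]],
    bank: MemoryBank,
) -> List[Section]:
    """Simplified planner: each round gathers evidence, then updates the outline."""
    outline: List[Section] = []
    for topic in topics:
        # Evidence acquisition: store each snippet and remember its id.
        ids = [bank.add(snippet) for snippet in search(topic)]
        # Outline update: the new section cites the evidence it rests on.
        outline.append(Section(title=topic, evidence_ids=ids))
    return outline


def write_report(
    outline: List[Section],
    bank: MemoryBank,
    compose: Callable[[str, List[str]], str],
) -> str:
    """Writer: compose section by section, loading only the cited evidence."""
    parts: List[str] = []
    for section in outline:
        evidence = bank.get(section.evidence_ids)  # targeted retrieval
        parts.append(compose(section.title, evidence))
    return "\n\n".join(parts)


# Stubs standing in for a real search tool and a real language model.
def fake_search(query: str) -> List[str]:
    return [f"note on {query} (a)", f"note on {query} (b)"]


def fake_compose(title: str, evidence: List[str]) -> str:
    return f"{title}: " + "; ".join(evidence)


bank = MemoryBank()
outline = plan(["background", "methods"], fake_search, bank)
print(write_report(outline, bank, fake_compose))
```

Because each `compose` call sees only the snippets its section cites, the per-call context stays small regardless of how much evidence the bank holds, which is the property the abstract credits with mitigating long-context failures.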