Web2BigTable: A Bi-Level Multi-Agent LLM System for Internet-Scale Information Search and Extraction
April 29, 2026
Authors: Yuxuan Huang, Yihang Chen, Zhiyuan He, Yuxiang Chen, Ka Yiu Lee, Huichi Zhou, Weilin Luo, Meng Fang, Jun Wang
cs.AI
Abstract
Agentic web search increasingly faces two distinct demands: deep reasoning over a single target, and structured aggregation across many entities and heterogeneous sources. Current systems struggle on both fronts. Breadth-oriented tasks demand schema-aligned outputs with wide coverage and cross-entity consistency, while depth-oriented tasks require coherent reasoning over long, branching search trajectories. We introduce Web2BigTable, a multi-agent framework for web-to-table search that supports both regimes. Web2BigTable adopts a bi-level architecture in which an upper-level orchestrator decomposes the task into sub-problems and lower-level worker agents solve them in parallel. Through a closed-loop run-verify-reflect process, the framework jointly improves decomposition and execution over time via persistent, human-readable external memory, with self-evolving updates to each individual agent. During execution, workers coordinate through a shared workspace that makes partial findings visible, allowing them to reduce redundant exploration, reconcile conflicting evidence, and adapt to emerging coverage gaps. Web2BigTable sets a new state of the art on WideSearch, reaching an Avg@4 Success Rate of 38.50 (7.5 times the second best at 5.10), Row F1 of 63.53 (+25.03 over the second best), and Item F1 of 80.12 (+14.42 over the second best). It also generalises to depth-oriented search on XBench-DeepSearch, achieving 73.0% accuracy. Code is available at https://github.com/web2bigtable/web2bigtable.
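
The bi-level loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: every name below (`SharedWorkspace`, `decompose`, `worker`, `verify`, `run`, and the `lookup` callback standing in for a web search) is a hypothetical placeholder, and the "reflect" step is reduced to appending a plain-text note to a list. It only aims to show how an orchestrator might split entities across parallel workers, how a shared workspace could prevent duplicate work, and how a verification pass might feed coverage gaps into the next round.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from threading import Lock


@dataclass
class SharedWorkspace:
    """Hypothetical shared store where workers publish partial table rows."""
    rows: dict = field(default_factory=dict)
    claimed: set = field(default_factory=set)
    _lock: Lock = field(default_factory=Lock, repr=False)

    def claim(self, entity):
        """Return True only for the first worker that claims an entity."""
        with self._lock:
            if entity in self.claimed:
                return False
            self.claimed.add(entity)
            return True

    def publish(self, entity, row):
        """Store a finished row, or release the claim if the lookup failed."""
        with self._lock:
            if row is None:
                self.claimed.discard(entity)
            else:
                self.rows[entity] = row


def decompose(entities, n_workers):
    """Upper level (assumed strategy): one round-robin slice per worker."""
    return [entities[i::n_workers] for i in range(n_workers)]


def worker(sub_problem, workspace, lookup):
    """Lower level: fill one row per entity, skipping already-claimed ones."""
    for entity in sub_problem:
        if workspace.claim(entity):
            workspace.publish(entity, lookup(entity))


def verify(entities, workspace):
    """Return the entities that still have no row (coverage gaps)."""
    return [e for e in entities if e not in workspace.rows]


def run(entities, lookup, n_workers=3, max_rounds=2):
    """Run, verify, and reflect until the table is complete or rounds run out."""
    workspace = SharedWorkspace()
    memory = []  # human-readable notes kept across rounds
    pending = list(entities)
    for round_no in range(max_rounds):
        if not pending:
            break
        sub_problems = decompose(pending, n_workers)
        with ThreadPoolExecutor(max_workers=n_workers) as pool:
            list(pool.map(lambda s: worker(s, workspace, lookup), sub_problems))
        pending = verify(entities, workspace)
        memory.append(f"round {round_no}: {len(pending)} rows missing")
    return workspace.rows, memory
```

Calling `run(["a", "bb"], lambda e: {"name": e})` returns the filled rows together with the per-round notes. A real system would replace `lookup` with search-and-extract agents and use the notes to change how later tasks are decomposed, which this sketch does not attempt.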