HierSearch: A Hierarchical Enterprise Deep Search Framework Integrating Local and Web Searches
August 11, 2025
Authors: Jiejun Tan, Zhicheng Dou, Yan Yu, Jiehan Cheng, Qiang Ju, Jian Xie, Ji-Rong Wen
cs.AI
Abstract
Recently, large reasoning models have demonstrated strong mathematical and
coding abilities, and deep search leverages their reasoning capabilities in
challenging information retrieval tasks. Existing deep search methods are
generally limited to a single knowledge source, either local or the Web.
However, enterprises often require private deep search systems that can
leverage search tools over both local and Web corpora. Simply training an
agent equipped with multiple search tools using flat reinforcement learning
(RL) is a straightforward idea, but it has problems such as low training data
efficiency and poor mastery of complex tools. To address these issues, we
propose a hierarchical agentic deep search framework, HierSearch, trained with
hierarchical RL. At the low level, a local deep search agent and a Web deep
search agent are trained to retrieve evidence from their corresponding domains.
At the high level, a planner agent coordinates low-level agents and provides
the final answer. Moreover, to prevent direct answer copying and error
propagation, we design a knowledge refiner that filters out hallucinations and
irrelevant evidence returned by low-level agents. Experiments show that
HierSearch outperforms flat RL and surpasses various deep search and
multi-source retrieval-augmented generation baselines on six benchmarks
across general, finance, and medical domains.
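
The two-level control flow described in the abstract can be illustrated with a
minimal Python sketch. It is not the authors' implementation: the names
Evidence, local_agent, web_agent, refine, and plan are hypothetical, the agents
are hard-coded stubs rather than RL-trained models, and the boolean flags
stand in for whatever judgment the real knowledge refiner makes. Final answer
generation by the planner is omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Evidence:
    source: str      # which low-level agent produced it: "local" or "web"
    text: str        # the evidence snippet
    supported: bool  # hypothetical flag: grounded in a retrieved document?
    relevant: bool   # hypothetical flag: bears on the question?


def local_agent(question: str) -> List[Evidence]:
    # Stub standing in for a trained local deep search agent (assumption).
    return [
        Evidence("local", "Internal memo: product X launched in 2021.", True, True),
        Evidence("local", "Cafeteria menu for Monday.", True, False),
    ]


def web_agent(question: str) -> List[Evidence]:
    # Stub standing in for a trained Web deep search agent (assumption).
    return [
        Evidence("web", "News article: product X launched in 2021.", True, True),
        # Simulated hallucination: relevant to the question but unsupported.
        Evidence("web", "Product X launched in 1999.", False, True),
    ]


def refine(evidence: List[Evidence]) -> List[Evidence]:
    # Toy knowledge refiner: drop hallucinated or irrelevant evidence.
    return [e for e in evidence if e.supported and e.relevant]


def plan(question: str,
         agents: Dict[str, Callable[[str], List[Evidence]]]) -> List[Evidence]:
    # Toy high-level planner: query each low-level agent in turn and keep
    # only the refined evidence. A real planner would decide which agent to
    # call, and when, and would then write the final answer.
    kept: List[Evidence] = []
    for agent in agents.values():
        kept.extend(refine(agent(question)))
    return kept


evidence = plan("When did product X launch?",
                {"local": local_agent, "web": web_agent})
for item in evidence:
    print(f"[{item.source}] {item.text}")
```

The sketch only shows the separation of concerns: the low-level agents return
raw evidence, the refiner sits between the levels, and the planner sees
filtered evidence only, which is how the abstract describes limiting error
propagation.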