InfoAgent: Advancing Autonomous Information-Seeking Agents

Abstract

Building Large Language Model agents that expand their capabilities by interacting with external tools represents a new frontier in AI research and applications. In this paper, we introduce InfoAgent, a deep research agent powered by an innovative data synthesis pipeline and orchestrated web search tools. To construct challenging, hard-to-find queries, we build entity trees and apply sub-tree sampling with entity fuzzification to systematically increase question difficulty. Unlike prior work that relies heavily on commercial search tools, we develop a dedicated self-hosted search infrastructure, which makes the agent environment more transparent and facilitates further advances in agent capability. We evaluate the effectiveness of our data pipeline by measuring the average number of tool calls required to correctly answer a question, and we show that our agent performs better when equipped with our tools. InfoAgent is post-trained from Qwen3-14B using a two-stage recipe: cold-start supervised fine-tuning to instill long-horizon search behaviors, followed by reinforcement learning, which significantly improves reasoning-driven tool use. With our methods, InfoAgent achieves 15.3% accuracy on BrowseComp, 29.2% on BrowseComp-ZH, and 40.4% on Xbench-DS, outperforming prior open-source deep research agents such as WebSailor-72B and DeepDive-32B.
