SAGE: Steerable Agentic Data Generation for Deep Search with Execution Feedback

Abstract

Deep search agents, which answer complex questions that require reasoning across multiple documents, can significantly speed up information seeking. Collecting human annotations for this application is prohibitively expensive because the exploration trajectories are long and complex. We propose an agentic pipeline that automatically generates high-quality, difficulty-controlled deep search question-answer pairs for a given corpus and target difficulty level. Our pipeline, SAGE, consists of a data generator, which proposes QA pairs, and a search agent, which attempts to solve each generated question and provides execution feedback to the data generator. The two components interact over multiple rounds to iteratively refine the question-answer pairs until they satisfy the target difficulty level. Our intrinsic evaluation shows that SAGE generates questions requiring diverse reasoning strategies while significantly increasing the correctness and difficulty of the generated data. Our extrinsic evaluation demonstrates up to a 23% relative performance gain on popular deep search benchmarks when deep search agents are trained with our synthetic data. Additional experiments show that agents trained on our data can adapt from fixed-corpus retrieval to Google Search at inference time, without further training.
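
As a rough illustration of the generator-agent interaction described in the abstract, the following Python sketch shows one way such a refinement loop could be structured. It is not the SAGE implementation. The names `QAPair`, `Feedback`, `refine`, `toy_generate`, and `toy_solve` are hypothetical, and two choices are assumptions made only for this example: the number of search steps stands in for difficulty, and agreement between the agent's answer and the proposed answer stands in for correctness.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class QAPair:
    question: str
    answer: str


@dataclass
class Feedback:
    """Execution feedback from one solve attempt (fields are assumptions)."""
    predicted_answer: str
    num_search_steps: int  # hypothetical proxy for difficulty


# A generator sees the previous QA pair and its feedback (both None at first).
Generator = Callable[[Optional[QAPair], Optional[Feedback]], QAPair]
# A search agent attempts the question and reports what happened.
Agent = Callable[[QAPair], Feedback]


def refine(generate: Generator, solve: Agent,
           target_steps: int, max_rounds: int = 5) -> Optional[QAPair]:
    """Alternate generation and solving until the pair meets the target."""
    qa: Optional[QAPair] = None
    feedback: Optional[Feedback] = None
    for _ in range(max_rounds):
        qa = generate(qa, feedback)
        feedback = solve(qa)
        # Assumed acceptance rule: agent reaches the proposed answer and
        # needed at least `target_steps` search steps to do so.
        correct = feedback.predicted_answer == qa.answer
        if correct and feedback.num_search_steps >= target_steps:
            return qa
    return None  # target not reached within the round budget


def toy_generate(prev: Optional[QAPair], fb: Optional[Feedback]) -> QAPair:
    """Toy stand-in: add one hop each round, using the last step count."""
    hops = 1 if fb is None else fb.num_search_steps + 1
    chain = [f"doc{i}" for i in range(hops)]
    return QAPair(question=" -> ".join(chain), answer=chain[-1])


def toy_solve(qa: QAPair) -> Feedback:
    """Toy stand-in: one search step per hop, always finds the last doc."""
    chain = qa.question.split(" -> ")
    return Feedback(predicted_answer=chain[-1], num_search_steps=len(chain))


if __name__ == "__main__":
    print(refine(toy_generate, toy_solve, target_steps=3))
```

In a real setting, `generate` and `solve` would wrap model calls over the corpus. The toy versions here exist only to make the control flow executable.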
