Masking Stale Observations Helps Search Agents -- Until It Doesn't: A Regime Map and Its Mechanism
May 29, 2026
Authors: Haoxiang Zhang, Qixin Xu, Zhuofeng Li, Lei Zhang, Pengcheng Jiang, Yu Zhang, Julian McAuley
cs.AI
Abstract
Long-horizon search agents accumulate large amounts of retrieved content across many tool calls, making context-budget efficiency increasingly important. A minimal intervention is to mask stale observations from the context as the trajectory progresses, but it remains unclear when this form of context management helps and why. We study observation masking through a systematic sweep over various agent backbones (4B to 284B parameters) and three retrievers on offline and live-web agentic search benchmarks. We find that the accuracy gain from masking follows an asymmetric inverted-U shape when plotted against the model's accuracy without context management: a plateau under weak retrievers, a peak when a strong retriever meets a mid-capacity model, and a sharp collapse when the model is saturated. This pattern reflects the interaction between retriever recall and the model's implicit filtering capacity, rather than either factor in isolation. Mechanistically, masking implements a token-for-turn trade-off: it removes observations the model has largely stopped attending to and pages the agent rarely re-opens. The added turns help when they convert failures into successes, but they fail when masking removes evidence the model would otherwise have used. We therefore reframe context management as a regime-dependent intervention and provide a holistic perspective for analyzing context use in agentic deep search. We release our scaffold and trajectories here (https://github.com/i-DeepSearch/observation-masking) to support future research.
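To make the intervention concrete, here is a minimal sketch of observation masking. It is not taken from the released scaffold. It assumes a simple policy in which every tool observation except the most recent `keep_last` is replaced by a short placeholder. The `Message` type, the `PLACEHOLDER` string, the `keep_last` parameter, and the whitespace-based token estimate are all hypothetical names introduced for illustration; the abstract does not specify the actual masking rule.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical placeholder text; the paper's actual replacement string is not stated.
PLACEHOLDER = "[observation masked]"


@dataclass
class Message:
    role: str      # "user", "assistant", or "tool"
    content: str


def mask_stale_observations(history: List[Message], keep_last: int = 2) -> List[Message]:
    """Return a copy of `history` in which all but the most recent
    `keep_last` tool observations are replaced by a placeholder.

    Assumption: "stale" means "older than the last `keep_last` observations".
    """
    tool_indices = [i for i, m in enumerate(history) if m.role == "tool"]
    # A slice ending at -0 would be empty, so keep_last == 0 is handled explicitly.
    stale = set(tool_indices[:-keep_last]) if keep_last > 0 else set(tool_indices)
    return [
        Message(m.role, PLACEHOLDER) if i in stale else m
        for i, m in enumerate(history)
    ]


def approx_tokens(history: List[Message]) -> int:
    """Crude whitespace token count, used only to illustrate the savings."""
    return sum(len(m.content.split()) for m in history)


if __name__ == "__main__":
    history = [
        Message("user", "Who founded the company that acquired X?"),
        Message("assistant", "search: company that acquired X"),
        Message("tool", "long retrieved page about the acquisition " * 50),
        Message("assistant", "search: founder of the acquiring company"),
        Message("tool", "long retrieved page about the founder " * 50),
    ]
    masked = mask_stale_observations(history, keep_last=1)
    print(approx_tokens(history), "->", approx_tokens(masked))
```

Under this assumed policy, the tokens freed by masking older observations are what the agent can spend on additional turns, which is the token-for-turn trade-off described above. The same sketch also shows the failure mode: if a masked page held evidence the model would still have used, that evidence is no longer in context.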