ReSum: Unlocking Long-Horizon Search Intelligence via Context Summarization
September 16, 2025
Authors: Xixi Wu, Kuan Li, Yida Zhao, Liwen Zhang, Litu Ou, Huifeng Yin, Zhongwang Zhang, Yong Jiang, Pengjun Xie, Fei Huang, Minhao Cheng, Shuai Wang, Hong Cheng, Jingren Zhou
cs.AI
Abstract
Large Language Model (LLM)-based web agents demonstrate strong performance on
knowledge-intensive tasks but are hindered by context window limitations in
paradigms like ReAct. Complex queries involving multiple entities, intertwined
relationships, and high uncertainty demand extensive search cycles that rapidly
exhaust context budgets before reaching complete solutions. To overcome this
challenge, we introduce ReSum, a novel paradigm that enables indefinite
exploration through periodic context summarization. ReSum converts growing
interaction histories into compact reasoning states, maintaining awareness of
prior discoveries while bypassing context constraints. For paradigm adaptation,
we propose ReSum-GRPO, integrating GRPO with segmented trajectory training and
advantage broadcasting to familiarize agents with summary-conditioned
reasoning. Extensive experiments on web agents of varying scales across three
benchmarks demonstrate that ReSum delivers an average absolute improvement of
4.5% over ReAct, with further gains of up to 8.2% following ReSum-GRPO
training. Notably, with only 1K training samples, our WebResummer-30B (a
ReSum-GRPO-trained version of WebSailor-30B) achieves 33.3% Pass@1 on
BrowseComp-zh and 18.3% on BrowseComp-en, surpassing existing open-source web
agents.
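
The two mechanisms named in the abstract, periodic context summarization and advantage broadcasting over segmented trajectories, can be illustrated with a minimal sketch. This is one reading of the abstract, not the authors' implementation. The names `Rollout`, `resum_rollout`, `broadcast_advantages`, `step`, `summarize`, and `budget_chars` are all hypothetical, the context budget is measured in characters as a stand-in for tokens, and the group-normalized advantage `(reward - mean) / std` is the standard GRPO form assumed here.

```python
from dataclasses import dataclass, field
from statistics import mean, pstdev
from typing import Callable, List, Tuple


@dataclass
class Rollout:
    """One attempt at a query, split into segments at each summary (hypothetical type)."""
    segments: List[List[str]] = field(default_factory=list)
    answer: str = ""


def resum_rollout(
    query: str,
    step: Callable[[List[str]], Tuple[str, bool]],  # assumed agent step: context -> (message, done)
    summarize: Callable[[List[str]], str],          # assumed summary tool: context -> compact state
    budget_chars: int,                              # assumed budget, in characters rather than tokens
    max_steps: int = 50,
) -> Rollout:
    """Run the agent, compressing the history whenever the budget is exceeded."""
    rollout = Rollout()
    context = [query]
    for _ in range(max_steps):
        message, done = step(context)
        context.append(message)
        if done:
            rollout.answer = message
            break
        if sum(len(m) for m in context) > budget_chars:
            # Close the current segment and restart from the query plus a summary.
            rollout.segments.append(context)
            context = [query, summarize(context)]
    rollout.segments.append(context)
    return rollout


def broadcast_advantages(
    rewards: List[float], rollouts: List[Rollout]
) -> List[List[float]]:
    """Compute one group-normalized advantage per rollout and copy it to every segment."""
    mu = mean(rewards)
    sigma = pstdev(rewards) or 1.0  # avoid division by zero when all rewards are equal
    return [
        [(r - mu) / sigma] * len(ro.segments)
        for r, ro in zip(rewards, rollouts)
    ]
```

In this sketch each segment would be treated as a separate training sample that shares the advantage of the full trajectory it came from, so early segments that end in a summary still receive credit for the final answer.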