
When to Memorize and When to Stop: Gated Recurrent Memory for Long-Context Reasoning

February 11, 2026
作者: Leheng Sheng, Yongtao Zhang, Wenchang Ma, Yaorui Shi, Ting Huang, Xiang Wang, An Zhang, Ke Shen, Tat-Seng Chua
cs.AI

Abstract

While reasoning over long context is crucial for various real-world applications, it remains challenging for large language models (LLMs), whose performance degrades as the context length grows. The recent MemAgent tackles this by processing the context chunk by chunk in an RNN-like loop and updating a textual memory used for final answering. However, this naive recurrent memory update has two crucial drawbacks: (i) the memory can quickly explode because it is updated indiscriminately, even on evidence-free chunks; and (ii) the loop lacks an exit mechanism, leading to unnecessary computation even after sufficient evidence has been collected. To address these issues, we propose GRU-Mem, which incorporates two text-controlled gates for more stable and efficient long-context reasoning. Specifically, in GRU-Mem, the memory is updated only when the update gate is open, and the recurrent loop exits immediately once the exit gate opens. To endow the model with these capabilities, we introduce two reward signals, r^{update} and r^{exit}, within end-to-end RL, rewarding correct updating and exiting behaviors, respectively. Experiments on various long-context reasoning tasks demonstrate the effectiveness and efficiency of GRU-Mem, which generally outperforms the vanilla MemAgent while achieving up to a 400% inference speedup.
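To make the gated recurrent loop concrete, here is a minimal sketch of the control flow the abstract describes. The `call_llm` helper, the prompt wording, and the `UPDATE=`/`EXIT=`/`MEMORY:` gate-token conventions are illustrative assumptions, not the paper's actual protocol; only the overall structure (chunk-wise reading, a conditional memory update, and an early exit) follows the abstract.

```python
# Sketch of GRU-Mem's gated recurrent loop (assumed interface, not the
# paper's implementation). `call_llm` is a hypothetical text-in/text-out
# function; the gate tokens below are invented for illustration.

def gru_mem_answer(chunks, question, call_llm):
    """Process context chunk by chunk, maintaining a textual memory.

    Per chunk, the model emits two textual gate decisions:
    - update gate: does this chunk contain evidence worth memorizing?
    - exit gate:   does the memory already suffice to answer?
    """
    memory = ""
    for chunk in chunks:
        # Read one chunk against the current memory and emit gate decisions,
        # plus a revised memory when the update gate is open.
        out = call_llm(
            f"Question: {question}\n"
            f"Memory so far: {memory}\n"
            f"New chunk: {chunk}\n"
            "Reply with UPDATE=yes/no, EXIT=yes/no, and, if UPDATE=yes, "
            "a revised memory after 'MEMORY:'."
        )
        if "UPDATE=yes" in out:
            # Update gate open: overwrite the textual memory. Closed-gate
            # chunks leave the memory untouched, preventing it from bloating
            # on evidence-free chunks.
            memory = out.split("MEMORY:", 1)[-1].strip()
        if "EXIT=yes" in out:
            # Exit gate open: stop reading further chunks, saving the
            # computation a vanilla MemAgent would spend on the remainder.
            break
    # Answer from the accumulated textual memory.
    return call_llm(f"Question: {question}\nMemory: {memory}\nAnswer:")
```

In this sketch the two gates are ordinary text emitted by the model; per the abstract, the correct opening of each gate is what the reward signals r^{update} and r^{exit} reinforce during end-to-end RL.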