

δ-mem: Efficient Online Memory for Large Language Models

May 12, 2026
Authors: Jingdi Lei, Di Zhang, Junxian Li, Weida Wang, Kaixuan Fan, Xiang Liu, Qihan Liu, Xiaoteng Ma, Baian Chen, Soujanya Poria
cs.AI

Abstract

Large language models increasingly need to accumulate and reuse historical information in long-term assistants and agent systems. Simply expanding the context window is costly and often fails to ensure effective context utilization. We propose δ-mem, a lightweight memory mechanism that augments a frozen full-attention backbone with a compact online state of associative memory. δ-mem compresses past information into a fixed-size state matrix updated by delta-rule learning, and uses its readout to generate low-rank corrections to the backbone's attention computation during generation. With only an 8×8 online memory state, δ-mem improves the average score to 1.10× that of the frozen backbone and 1.15× that of the strongest non-δ-mem memory baseline. It achieves larger gains on memory-heavy benchmarks, reaching 1.31× on MemoryAgentBench and 1.20× on LoCoMo, while largely preserving general capabilities. These results show that effective memory can be realized through a compact online state directly coupled with attention computation, without full fine-tuning, backbone replacement, or explicit context extension.
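
To make the described mechanism concrete, the sketch below shows a delta-rule write into a fixed-size associative state and a readout that feeds a low-rank additive correction to a frozen attention output. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the projections `W_q` and `U`, the write strength `beta`, and the model width are all hypothetical.

```python
import numpy as np

def delta_write(S, k, v, beta=0.5):
    """Delta-rule update: nudge the association S @ k toward the target value v.

    S    : (d, d) fixed-size associative memory state
    k, v : (d,) key / value vectors for the new token
    beta : write strength of the online update (assumed hyperparameter)
    """
    pred = S @ k                          # what the state currently retrieves for key k
    return S + beta * np.outer(v - pred, k)

def delta_read(S, q):
    """Readout: retrieve the value the state associates with query q."""
    return S @ q

d, d_model = 8, 32                        # 8x8 state as in the abstract; model width is illustrative
rng = np.random.default_rng(0)
S = np.zeros((d, d))

# Write a short stream of key/value pairs online.
for _ in range(16):
    k = rng.standard_normal(d)
    k /= np.linalg.norm(k)                # unit-norm keys keep the rank-one update well scaled
    v = rng.standard_normal(d)
    S = delta_write(S, k, v)

# Hypothetical low-rank additive correction to a frozen attention output `o`:
# a small down-projection forms the query, and the readout is projected back up.
W_q = 0.1 * rng.standard_normal((d, d_model))   # query projection (assumed)
U = 0.01 * rng.standard_normal((d_model, d))    # output projection (assumed)

h = rng.standard_normal(d_model)          # current hidden state (stand-in)
o = rng.standard_normal(d_model)          # frozen backbone attention output (stand-in)
o_corrected = o + U @ delta_read(S, W_q @ h)    # low-rank correction added to attention output
print(o_corrected.shape)                  # (32,)
```

Because each write is a rank-one outer-product correction toward the target value, the state remains a fixed 8×8 matrix regardless of how long the history grows, which is what keeps the memory's cost constant while the backbone itself stays frozen.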