

δ-mem: Efficient Online Memory for Large Language Models

May 12, 2026
作者: Jingdi Lei, Di Zhang, Junxian Li, Weida Wang, Kaixuan Fan, Xiang Liu, Qihan Liu, Xiaoteng Ma, Baian Chen, Soujanya Poria
cs.AI

Abstract

Large language models increasingly need to accumulate and reuse historical information in long-term assistants and agent systems. Simply expanding the context window is costly and often fails to ensure effective context utilization. We propose δ-mem, a lightweight memory mechanism that augments a frozen full-attention backbone with a compact online state of associative memory. δ-mem compresses past information into a fixed-size state matrix updated by delta-rule learning, and uses its readout to generate low-rank corrections to the backbone's attention computation during generation. With only an 8×8 online memory state, δ-mem improves the average score to 1.10× that of the frozen backbone and 1.15× that of the strongest non-δ-mem memory baseline. It achieves larger gains on memory-heavy benchmarks, reaching 1.31× on MemoryAgentBench and 1.20× on LoCoMo, while largely preserving general capabilities. These results show that effective memory can be realized through a compact online state directly coupled with attention computation, without full fine-tuning, backbone replacement, or explicit context extension.
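
The core mechanism the abstract describes, a fixed-size state matrix written with the delta rule and read out per query, can be sketched in a few lines. The snippet below is a minimal illustration under stated assumptions, not the authors' implementation: the class name `DeltaMemorySketch`, the write strength `beta`, the unit-norm keys, and the additive coupling of the readout to the attention output are all expository assumptions; only the delta-rule update and linear readout follow the standard formulation referenced in the abstract.

```python
import torch

class DeltaMemorySketch:
    """Minimal sketch of a delta-rule associative memory.

    Illustrative only: the state shape, write rule, and readout follow
    the generic delta-rule formulation; the paper's exact
    parameterization and its coupling to attention may differ.
    """

    def __init__(self, d: int = 8, beta: float = 0.5):
        self.S = torch.zeros(d, d)  # fixed-size online state (e.g., 8x8)
        self.beta = beta            # write strength (hypothetical default)

    def write(self, k: torch.Tensor, v: torch.Tensor) -> None:
        # Delta rule: nudge the stored association S @ k toward the
        # target value v, rather than purely accumulating outer
        # products as a Hebbian update would.
        pred = self.S @ k
        self.S = self.S + self.beta * torch.outer(v - pred, k)

    def read(self, q: torch.Tensor) -> torch.Tensor:
        # Readout: retrieve the value currently associated with query q.
        return self.S @ q

# Hypothetical usage: write one association, read it back, and apply the
# readout as an additive low-rank correction to an attention output (the
# exact coupling in δ-mem is not specified at the abstract's level of
# detail).
mem = DeltaMemorySketch(d=8)
k = torch.randn(8)
k = k / k.norm()                     # unit-norm key (assumption)
v = torch.randn(8)
mem.write(k, v)
attn_out = torch.randn(8)
corrected = attn_out + mem.read(k)   # additive correction (assumption)
```

Because the state is a fixed d×d matrix, the memory's cost is constant in sequence length, which is what lets a tiny 8×8 state augment a frozen backbone without any explicit context extension.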