SimpleMem: Efficient Lifelong Memory for LLM Agents
January 5, 2026
Authors: Jiaqi Liu, Yaofeng Su, Peng Xia, Siwei Han, Zeyu Zheng, Cihang Xie, Mingyu Ding, Huaxiu Yao
cs.AI
Abstract
To support reliable long-term interaction in complex environments, LLM agents require memory systems that efficiently manage historical experiences. Existing approaches either retain full interaction histories via passive context extension, leading to substantial redundancy, or rely on iterative reasoning to filter noise, incurring high token costs. To address this challenge, we introduce SimpleMem, an efficient memory framework based on semantic lossless compression. We propose a three-stage pipeline designed to maximize information density and token utilization: (1) Semantic Structured Compression, which applies entropy-aware filtering to distill unstructured interactions into compact, multi-view indexed memory units; (2) Recursive Memory Consolidation, an asynchronous process that integrates related units into higher-level abstract representations to reduce redundancy; and (3) Adaptive Query-Aware Retrieval, which dynamically adjusts retrieval scope based on query complexity to construct precise context efficiently. Experiments on benchmark datasets show that our method consistently outperforms baseline approaches in accuracy, retrieval efficiency, and inference cost, achieving an average F1 improvement of 26.4% while reducing inference-time token consumption by up to 30-fold, demonstrating a superior balance between performance and efficiency. Code is available at https://github.com/aiming-lab/SimpleMem.
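The three-stage pipeline described in the abstract can be illustrated with a minimal sketch. Everything below is hypothetical: the class name `SimpleMemSketch`, the entropy threshold, and the word-overlap heuristics are illustrative stand-ins, not the paper's actual implementation (see the linked repository for that). The sketch only shows the shape of the idea: (1) entropy-aware filtering compresses raw turns into compact units, (2) related units are consolidated into higher-level entries, and (3) retrieval scope scales with query complexity.

```python
import math
from collections import Counter

def token_entropy(text: str) -> float:
    """Shannon entropy over word frequencies -- a crude stand-in for the
    paper's entropy-aware information-density signal."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

class SimpleMemSketch:
    """Illustrative three-stage memory pipeline (names are hypothetical)."""

    def __init__(self, entropy_threshold: float = 1.5):
        self.units: list[str] = []
        self.entropy_threshold = entropy_threshold

    def compress(self, interaction: str) -> None:
        """(1) Semantic structured compression: keep only turns whose
        entropy suggests they carry real information."""
        for turn in interaction.split("\n"):
            if turn and token_entropy(turn) >= self.entropy_threshold:
                self.units.append(turn.strip())

    def consolidate(self) -> None:
        """(2) Recursive consolidation: merge units that share vocabulary
        into a single higher-level unit (a proxy for semantic relatedness;
        the paper does this asynchronously with richer representations)."""
        merged: list[str] = []
        for unit in self.units:
            words = set(unit.lower().split())
            for i, m in enumerate(merged):
                if len(words & set(m.lower().split())) >= 2:
                    merged[i] = m + " | " + unit
                    break
            else:
                merged.append(unit)
        self.units = merged

    def retrieve(self, query: str) -> list[str]:
        """(3) Adaptive query-aware retrieval: widen the retrieval scope k
        with query complexity (here, just the query's word count)."""
        k = max(1, len(query.split()) // 3)
        qwords = set(query.lower().split())
        scored = sorted(
            self.units,
            key=lambda u: len(set(u.lower().split()) & qwords),
            reverse=True,
        )
        return scored[:k]
```

A toy run: low-entropy turns like "ok" are dropped at compression, overlapping units merge at consolidation, and a longer query retrieves proportionally more context. The real system presumably replaces word overlap with learned semantic indexing.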