SimpleMem: Efficient Lifelong Memory for LLM Agents
January 5, 2026
Authors: Jiaqi Liu, Yaofeng Su, Peng Xia, Siwei Han, Zeyu Zheng, Cihang Xie, Mingyu Ding, Huaxiu Yao
cs.AI
Abstract
To support reliable long-term interaction in complex environments, LLM agents require memory systems that efficiently manage historical experiences. Existing approaches either retain full interaction histories via passive context extension, leading to substantial redundancy, or rely on iterative reasoning to filter noise, incurring high token costs. To address this challenge, we introduce SimpleMem, an efficient memory framework based on semantic lossless compression. We propose a three-stage pipeline designed to maximize information density and token utilization: (1) Semantic Structured Compression, which applies entropy-aware filtering to distill unstructured interactions into compact, multi-view indexed memory units; (2) Recursive Memory Consolidation, an asynchronous process that integrates related units into higher-level abstract representations to reduce redundancy; and (3) Adaptive Query-Aware Retrieval, which dynamically adjusts retrieval scope based on query complexity to construct precise context efficiently. Experiments on benchmark datasets show that our method consistently outperforms baseline approaches in accuracy, retrieval efficiency, and inference cost, achieving an average F1 improvement of 26.4% while reducing inference-time token consumption by up to 30-fold, demonstrating a superior balance between performance and efficiency. Code is available at https://github.com/aiming-lab/SimpleMem.
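The abstract does not include implementation details, but the three-stage pipeline it describes can be sketched at a high level. The following is a minimal illustrative sketch, not the paper's actual code: the class and method names, the word-level Shannon-entropy filter, the view-key grouping, and the query-length heuristic for retrieval scope are all hypothetical stand-ins for the techniques the abstract names.

```python
# Illustrative sketch of the three-stage pipeline from the abstract.
# All names and heuristics are assumptions, not the SimpleMem implementation.
import math
from collections import Counter


class MemoryUnit:
    """A compact memory unit with multi-view index keys (e.g. topic, entity)."""

    def __init__(self, text, views):
        self.text = text
        self.views = views


class SimpleMemSketch:
    def __init__(self, entropy_threshold=0.5):
        self.units = []
        self.entropy_threshold = entropy_threshold

    def _entropy(self, text):
        # Shannon entropy over word frequencies: a crude stand-in for
        # "entropy-aware filtering" of low-information interactions.
        words = text.lower().split()
        if not words:
            return 0.0
        counts = Counter(words)
        total = len(words)
        return -sum((c / total) * math.log2(c / total) for c in counts.values())

    def compress(self, interaction, views):
        # Stage 1 (Semantic Structured Compression): keep only interactions
        # whose entropy clears the threshold, stored as indexed memory units.
        if self._entropy(interaction) >= self.entropy_threshold:
            self.units.append(MemoryUnit(interaction, views))

    def consolidate(self):
        # Stage 2 (Recursive Memory Consolidation): merge units that share a
        # primary view key into a single higher-level unit to cut redundancy.
        by_view = {}
        for unit in self.units:
            by_view.setdefault(unit.views[0], []).append(unit)
        self.units = [
            MemoryUnit(" | ".join(u.text for u in group), [view])
            for view, group in by_view.items()
        ]

    def retrieve(self, query, base_k=1):
        # Stage 3 (Adaptive Query-Aware Retrieval): widen the retrieval scope
        # with query complexity, here approximated by query length.
        k = base_k + len(query.split()) // 5
        query_words = set(query.lower().split())
        scored = sorted(
            self.units,
            key=lambda u: len(query_words & set(u.text.lower().split())),
            reverse=True,
        )
        return scored[:k]
```

A usage sketch under the same assumptions: low-entropy chatter like "ok ok ok" is dropped at ingestion, units sharing the `"preferences"` view are consolidated into one, and a longer query retrieves a proportionally wider context.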