
LightMem: Lightweight and Efficient Memory-Augmented Generation

October 21, 2025
Authors: Jizhan Fang, Xinle Deng, Haoming Xu, Ziyan Jiang, Yuqi Tang, Ziwen Xu, Shumin Deng, Yunzhi Yao, Mengru Wang, Shuofei Qiao, Huajun Chen, Ningyu Zhang
cs.AI

Abstract

Despite their remarkable capabilities, Large Language Models (LLMs) struggle to effectively leverage historical interaction information in dynamic and complex environments. Memory systems enable LLMs to move beyond stateless interactions by introducing persistent information storage, retrieval, and utilization mechanisms. However, existing memory systems often introduce substantial time and computational overhead. To this end, we introduce a new memory system called LightMem, which strikes a balance between the performance and efficiency of memory systems. Inspired by the Atkinson-Shiffrin model of human memory, LightMem organizes memory into three complementary stages. First, cognition-inspired sensory memory rapidly filters irrelevant information through lightweight compression and groups the remaining information by topic. Next, topic-aware short-term memory consolidates these topic-based groups, organizing and summarizing content for more structured access. Finally, long-term memory with sleep-time update employs an offline procedure that decouples consolidation from online inference. Experiments on LongMemEval with GPT and Qwen backbones show that LightMem outperforms strong baselines in accuracy (gains of up to 10.9%) while reducing token usage by up to 117x, API calls by up to 159x, and runtime by over 12x. The code is available at https://github.com/zjunlp/LightMem.
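The three stages described in the abstract can be illustrated with a toy sketch. This is not the LightMem API or the authors' implementation: the class name ToyMemory, the FILLER word list, the buffer_limit parameter, and all method names are hypothetical. Several simplifications are assumed: the caller supplies the topic label (the paper groups by topic automatically), compression is a plain filler-word filter, and "summarizing" is a string join standing in for a model call.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical filler list; stands in for lightweight compression.
FILLER = {"um", "uh", "well", "okay"}


def compress(message: str) -> str:
    """Sensory stage (toy): drop filler words, keep everything else."""
    words = message.split()
    return " ".join(w for w in words if w.strip(",.!?").lower() not in FILLER)


@dataclass
class ToyMemory:
    buffer_limit: int = 2  # messages per topic before consolidation
    short_term: dict = field(default_factory=lambda: defaultdict(list))
    pending: list = field(default_factory=list)    # waiting for offline update
    long_term: dict = field(default_factory=dict)  # topic -> list of summaries

    def observe(self, topic: str, message: str) -> None:
        """Online path: filter, group by topic, consolidate full buffers."""
        kept = compress(message)
        if not kept:
            return
        bucket = self.short_term[topic]
        bucket.append(kept)
        if len(bucket) >= self.buffer_limit:
            # Short-term stage (toy): a join stands in for summarization.
            self.pending.append((topic, " | ".join(bucket)))
            bucket.clear()

    def sleep_update(self) -> int:
        """Offline path: move pending summaries into long-term memory."""
        moved = len(self.pending)
        for topic, summary in self.pending:
            self.long_term.setdefault(topic, []).append(summary)
        self.pending.clear()
        return moved

    def retrieve(self, topic: str) -> list:
        """Read-only lookup; never triggers consolidation."""
        return list(self.long_term.get(topic, []))
```

In this sketch, observe does only cheap work on the online path, and nothing reaches long_term until sleep_update runs. That separation is the point being illustrated: consolidation cost is paid offline rather than during inference.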