Memory Matters More: Event-Centric Memory as a Logic Map for Agent Searching and Reasoning
January 8, 2026
Authors: Yuyang Hu, Jiongnan Liu, Jiejun Tan, Yutao Zhu, Zhicheng Dou
cs.AI
Abstract
Large language models (LLMs) are increasingly deployed as intelligent agents that reason, plan, and interact with their environments. To effectively scale to long-horizon scenarios, a key capability for such agents is a memory mechanism that can retain, organize, and retrieve past experiences to support downstream decision-making. However, most existing approaches organize and store memories in a flat manner and rely on simple similarity-based retrieval techniques. Even when structured memory is introduced, existing methods often struggle to explicitly capture the logical relationships among experiences or memory units. Moreover, memory access is largely detached from the constructed structure and still depends on shallow semantic retrieval, preventing agents from reasoning logically over long-horizon dependencies. In this work, we propose CompassMem, an event-centric memory framework inspired by Event Segmentation Theory. CompassMem organizes memory as an Event Graph by incrementally segmenting experiences into events and linking them through explicit logical relations. This graph serves as a logic map, enabling agents to perform structured and goal-directed navigation over memory beyond superficial retrieval, progressively gathering valuable memories to support long-horizon reasoning. Experiments on LoCoMo and NarrativeQA demonstrate that CompassMem consistently improves both retrieval and reasoning performance across multiple backbone models.
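To make the idea of an event graph used as a logic map more concrete, the following is a minimal sketch and not the CompassMem implementation. Every name in it (`Event`, `EventGraph`, `link`, `navigate`, the `"causes"` relation label) is hypothetical. It assumes events have already been segmented, picks a starting event by simple keyword overlap, and then expands breadth-first along explicit relation edges; the abstract does not specify how the framework selects or follows edges, so this traversal is only a stand-in.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Event:
    """One segmented event. Fields are illustrative, not taken from the paper."""
    event_id: str
    text: str


@dataclass
class EventGraph:
    events: dict = field(default_factory=dict)  # event_id -> Event
    edges: dict = field(default_factory=dict)   # event_id -> [(relation, neighbor_id)]

    def add_event(self, event_id, text):
        self.events[event_id] = Event(event_id, text)
        self.edges.setdefault(event_id, [])

    def link(self, src, dst, relation):
        # Store the edge in both directions so traversal can move either way.
        self.edges[src].append((relation, dst))
        self.edges[dst].append((f"inverse:{relation}", src))

    def navigate(self, query, max_hops=2):
        """Seed with the best keyword match, then expand along relation edges.

        Assumption: keyword overlap and breadth-first expansion stand in for
        whatever retrieval and navigation policy a real system would use.
        """
        words = set(query.lower().split())

        def overlap(event):
            return len(words & set(event.text.lower().split()))

        seed = max(self.events.values(), key=overlap)
        visited = {seed.event_id}
        collected = [seed.event_id]
        frontier = deque([(seed.event_id, 0)])
        while frontier:
            node, depth = frontier.popleft()
            if depth == max_hops:
                continue
            for _relation, neighbor in self.edges[node]:
                if neighbor not in visited:
                    visited.add(neighbor)
                    collected.append(neighbor)
                    frontier.append((neighbor, depth + 1))
        return collected


graph = EventGraph()
graph.add_event("e1", "Alice adopted a dog in March")
graph.add_event("e2", "Alice moved to a house with a yard")
graph.add_event("e3", "Bob started a new job")
graph.link("e1", "e2", "causes")

result = graph.navigate("why did Alice move to a house", max_hops=1)
print(result)
```

In this toy example the query matches the move event most closely, and following the relation edge also surfaces the adoption event that is linked to it, while the unrelated event is never visited. A flat similarity search over the same three texts would have no explicit link between the first two events to follow.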