The AI Hippocampus: How Far are We From Human Memory?

January 14, 2026
Authors: Zixia Jia, Jiaqi Li, Yipeng Kang, Yuxuan Wang, Tong Wu, Quansen Wang, Xiaobo Wang, Shuyi Zhang, Junzhe Shen, Qing Li, Siyuan Qi, Yitao Liang, Di He, Zilong Zheng, Song-Chun Zhu
cs.AI

Abstract

Memory plays a foundational role in augmenting the reasoning, adaptability, and contextual fidelity of modern Large Language Models (LLMs) and Multi-Modal LLMs (MLLMs). As these models transition from static predictors to interactive systems capable of continual learning and personalized inference, the incorporation of memory mechanisms has emerged as a central theme in their architectural and functional evolution. This survey presents a comprehensive and structured synthesis of memory in LLMs and MLLMs, organizing the literature into a cohesive taxonomy of three paradigms: implicit, explicit, and agentic memory. Implicit memory refers to the knowledge embedded within the internal parameters of pre-trained transformers, encompassing their capacity for memorization, associative retrieval, and contextual reasoning; recent work has explored methods to interpret, manipulate, and reconfigure this latent memory. Explicit memory involves external storage and retrieval components designed to augment model outputs with dynamic, queryable knowledge representations, such as textual corpora, dense vectors, and graph-based structures, thereby enabling scalable and updatable interaction with information sources. Agentic memory introduces persistent, temporally extended memory structures within autonomous agents, facilitating long-term planning, self-consistency, and collaborative behavior in multi-agent systems, with relevance to embodied and interactive AI. Extending beyond text, the survey examines the integration of memory within multi-modal settings, where coherence across vision, language, audio, and action modalities is essential. Key architectural advances, benchmark tasks, and open challenges are discussed, including memory capacity, alignment, factual consistency, and cross-system interoperability.