The AI Hippocampus: How Far are We From Human Memory?

January 14, 2026
Authors: Zixia Jia, Jiaqi Li, Yipeng Kang, Yuxuan Wang, Tong Wu, Quansen Wang, Xiaobo Wang, Shuyi Zhang, Junzhe Shen, Qing Li, Siyuan Qi, Yitao Liang, Di He, Zilong Zheng, Song-Chun Zhu
cs.AI

Abstract

Memory plays a foundational role in augmenting the reasoning, adaptability, and contextual fidelity of modern Large Language Models (LLMs) and Multi-Modal LLMs (MLLMs). As these models transition from static predictors to interactive systems capable of continual learning and personalized inference, the incorporation of memory mechanisms has emerged as a central theme in their architectural and functional evolution. This survey presents a comprehensive and structured synthesis of memory in LLMs and MLLMs, organizing the literature into a cohesive taxonomy of three paradigms: implicit, explicit, and agentic memory. Implicit memory refers to the knowledge embedded within the internal parameters of pre-trained transformers, encompassing their capacity for memorization, associative retrieval, and contextual reasoning. Recent work has explored methods to interpret, manipulate, and reconfigure this latent memory. Explicit memory involves external storage and retrieval components designed to augment model outputs with dynamic, queryable knowledge representations, such as textual corpora, dense vectors, and graph-based structures, thereby enabling scalable and updatable interaction with information sources. Agentic memory introduces persistent, temporally extended memory structures within autonomous agents, facilitating long-term planning, self-consistency, and collaborative behavior in multi-agent systems, with relevance to embodied and interactive AI. Extending beyond text, the survey examines the integration of memory within multi-modal settings, where coherence across vision, language, audio, and action modalities is essential. Key architectural advances, benchmark tasks, and open challenges are discussed, including issues related to memory capacity, alignment, factual consistency, and cross-system interoperability.