MemOS: A Memory OS for AI System
July 4, 2025
作者: Zhiyu Li, Shichao Song, Chenyang Xi, Hanyu Wang, Chen Tang, Simin Niu, Ding Chen, Jiawei Yang, Chunyu Li, Qingchen Yu, Jihao Zhao, Yezhaohui Wang, Peng Liu, Zehao Lin, Pengyuan Wang, Jiahao Huo, Tianyi Chen, Kai Chen, Kehang Li, Zhen Tao, Junpeng Ren, Huayi Lai, Hao Wu, Bo Tang, Zhenren Wang, Zhaoxin Fan, Ningyu Zhang, Linfeng Zhang, Junchi Yan, Mingchuan Yang, Tong Xu, Wei Xu, Huajun Chen, Haofeng Wang, Hongkang Yang, Wentao Zhang, Zhi-Qin John Xu, Siheng Chen, Feiyu Xiong
cs.AI
Abstract
Large Language Models (LLMs) have become an essential infrastructure for
Artificial General Intelligence (AGI), yet their lack of well-defined memory
management systems hinders the development of long-context reasoning, continual
personalization, and knowledge consistency.Existing models mainly rely on
static parameters and short-lived contextual states, limiting their ability to
track user preferences or update knowledge over extended periods.While
Retrieval-Augmented Generation (RAG) introduces external knowledge in plain
text, it remains a stateless workaround without lifecycle control or
integration with persistent representations. Recent work has modeled the
training and inference cost of LLMs from a memory hierarchy perspective,
showing that introducing an explicit memory layer between parameter memory and
external retrieval can substantially reduce these costs by externalizing
specific knowledge. Beyond computational efficiency, LLMs face broader
challenges arising from how information is distributed over time and context,
requiring systems capable of managing heterogeneous knowledge spanning
different temporal scales and sources. To address this challenge, we propose
MemOS, a memory operating system that treats memory as a manageable system
resource. It unifies the representation, scheduling, and evolution of
plaintext, activation-based, and parameter-level memories, enabling
cost-efficient storage and retrieval. As the basic unit, a MemCube encapsulates
both memory content and metadata such as provenance and versioning. MemCubes
can be composed, migrated, and fused over time, enabling flexible transitions
between memory types and bridging retrieval with parameter-based learning.
MemOS establishes a memory-centric system framework that brings
controllability, plasticity, and evolvability to LLMs, laying the foundation
for continual learning and personalized modeling.
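The abstract describes the MemCube as a unit that bundles memory content with metadata such as provenance and versioning, and that can be composed, migrated between memory types, and fused. A minimal sketch of such a structure is given below; all class names, fields, and methods here are illustrative assumptions, not the paper's actual API.

```python
from dataclasses import dataclass
from enum import Enum


class MemoryType(Enum):
    """The three memory forms the abstract distinguishes."""
    PLAINTEXT = "plaintext"
    ACTIVATION = "activation"
    PARAMETER = "parameter"


@dataclass
class MemCube:
    """Hypothetical memory unit: content plus provenance/version metadata."""
    content: str
    mem_type: MemoryType
    provenance: str   # where this memory came from (assumed field)
    version: int = 1

    def migrate(self, target: MemoryType) -> "MemCube":
        # Transition to another memory type; bump the version to track lineage.
        return MemCube(self.content, target, self.provenance, self.version + 1)

    @staticmethod
    def fuse(a: "MemCube", b: "MemCube") -> "MemCube":
        # Fuse two cubes into one, concatenating content and merging provenance.
        return MemCube(
            content=a.content + "\n" + b.content,
            mem_type=a.mem_type,
            provenance=f"{a.provenance}+{b.provenance}",
            version=max(a.version, b.version) + 1,
        )


# Example lifecycle: two plaintext memories are fused, then the fused
# memory is migrated toward parameter-level storage.
pref = MemCube("user prefers concise answers", MemoryType.PLAINTEXT, "chat-log")
fact = MemCube("project deadline is Friday", MemoryType.PLAINTEXT, "calendar")
fused = MemCube.fuse(pref, fact)
migrated = fused.migrate(MemoryType.PARAMETER)
```

The versioning-on-change convention above is one plausible way to realize the lifecycle control the abstract contrasts with stateless RAG; the real system's scheduling and evolution mechanisms are not specified in this abstract.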