MemOS: A Memory OS for AI System
July 4, 2025
作者: Zhiyu Li, Shichao Song, Chenyang Xi, Hanyu Wang, Chen Tang, Simin Niu, Ding Chen, Jiawei Yang, Chunyu Li, Qingchen Yu, Jihao Zhao, Yezhaohui Wang, Peng Liu, Zehao Lin, Pengyuan Wang, Jiahao Huo, Tianyi Chen, Kai Chen, Kehang Li, Zhen Tao, Junpeng Ren, Huayi Lai, Hao Wu, Bo Tang, Zhenren Wang, Zhaoxin Fan, Ningyu Zhang, Linfeng Zhang, Junchi Yan, Mingchuan Yang, Tong Xu, Wei Xu, Huajun Chen, Haofeng Wang, Hongkang Yang, Wentao Zhang, Zhi-Qin John Xu, Siheng Chen, Feiyu Xiong
cs.AI
Abstract
Large Language Models (LLMs) have become an essential infrastructure for
Artificial General Intelligence (AGI), yet their lack of well-defined memory
management systems hinders the development of long-context reasoning, continual
personalization, and knowledge consistency.Existing models mainly rely on
static parameters and short-lived contextual states, limiting their ability to
track user preferences or update knowledge over extended periods.While
Retrieval-Augmented Generation (RAG) introduces external knowledge in plain
text, it remains a stateless workaround without lifecycle control or
integration with persistent representations. Recent work has modeled the
training and inference cost of LLMs from a memory hierarchy perspective,
showing that introducing an explicit memory layer between parameter memory and
external retrieval can substantially reduce these costs by externalizing
specific knowledge. Beyond computational efficiency, LLMs face broader
challenges arising from how information is distributed over time and context,
requiring systems capable of managing heterogeneous knowledge spanning
different temporal scales and sources. To address this challenge, we propose
MemOS, a memory operating system that treats memory as a manageable system
resource. It unifies the representation, scheduling, and evolution of
plaintext, activation-based, and parameter-level memories, enabling
cost-efficient storage and retrieval. As the basic unit, a MemCube encapsulates
both memory content and metadata such as provenance and versioning. MemCubes
can be composed, migrated, and fused over time, enabling flexible transitions
between memory types and bridging retrieval with parameter-based learning.
MemOS establishes a memory-centric system framework that brings
controllability, plasticity, and evolvability to LLMs, laying the foundation
for continual learning and personalized modeling.
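The abstract describes the MemCube as a unit that bundles memory content with metadata such as provenance and versioning, and that can be composed, migrated between memory types, and fused. A minimal sketch of such a structure is given below; all class names, fields, and methods here are illustrative assumptions, not the paper's actual API.

```python
from dataclasses import dataclass
from enum import Enum


class MemoryType(Enum):
    """The three memory forms the abstract distinguishes."""
    PLAINTEXT = "plaintext"
    ACTIVATION = "activation"
    PARAMETER = "parameter"


@dataclass
class MemCube:
    """Hypothetical memory unit: content plus provenance/version metadata."""
    content: str
    mem_type: MemoryType
    provenance: str   # where this memory came from (assumed field)
    version: int = 1

    def migrate(self, target: MemoryType) -> "MemCube":
        # Transition to another memory type; bump the version to track lineage.
        return MemCube(self.content, target, self.provenance, self.version + 1)

    @staticmethod
    def fuse(a: "MemCube", b: "MemCube") -> "MemCube":
        # Fuse two cubes into one, concatenating content and merging provenance.
        return MemCube(
            content=a.content + "\n" + b.content,
            mem_type=a.mem_type,
            provenance=f"{a.provenance}+{b.provenance}",
            version=max(a.version, b.version) + 1,
        )


# Example lifecycle: two plaintext memories are fused, then the fused
# memory is migrated toward parameter-level storage.
pref = MemCube("user prefers concise answers", MemoryType.PLAINTEXT, "chat-log")
fact = MemCube("project deadline is Friday", MemoryType.PLAINTEXT, "calendar")
fused = MemCube.fuse(pref, fact)
migrated = fused.migrate(MemoryType.PARAMETER)
```

The versioning-on-change convention above is one plausible way to realize the lifecycle control the abstract contrasts with stateless RAG; the real system's scheduling and evolution mechanisms are not specified in this abstract.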