Memanto: Typed Semantic Memory with Information-Theoretic Retrieval for Long-Horizon Agents

Abstract

The transition from stateless language model inference to persistent, multi-session autonomous agents has revealed memory to be a primary architectural bottleneck in deploying production-grade agentic systems. Existing approaches largely depend on hybrid semantic graph architectures, which impose substantial computational overhead during both ingestion and retrieval. These systems typically require LLM-mediated entity extraction, explicit graph schema maintenance, and multi-query retrieval pipelines. This paper introduces Memanto, a universal memory layer for agentic artificial intelligence that challenges the prevailing assumption that knowledge graph complexity is necessary to achieve high-fidelity agent memory. Memanto integrates a typed semantic memory schema comprising thirteen predefined memory categories, an automated conflict resolution mechanism, and temporal versioning. These components are enabled by Moorcheh's Information Theoretic Search engine, a no-indexing semantic database that provides deterministic retrieval with sub-90-millisecond latency while eliminating ingestion delay. In systematic benchmarking on the LongMemEval and LoCoMo evaluation suites, Memanto achieves state-of-the-art accuracy of 89.8% and 87.1%, respectively. These results surpass all evaluated hybrid-graph and vector-based systems while requiring only a single retrieval query, incurring no ingestion cost, and maintaining substantially lower operational complexity. A five-stage progressive ablation study quantifies the contribution of each architectural component, followed by a discussion of the implications for scalable deployment of agentic memory systems.
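To make the combination of typed memories, conflict resolution, and temporal versioning more concrete, the following is a minimal sketch and not Memanto's implementation. Every name in it (`MemoryType`, `MemoryRecord`, `TypedMemoryStore`, the `subject` key, the three example categories) is hypothetical: the abstract does not enumerate the thirteen categories or describe the API. The conflict rule shown, "a newer write on the same typed subject supersedes the older one, which is kept as history", is one simple policy assumed for illustration. Retrieval through Moorcheh is not modeled here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class MemoryType(Enum):
    # Hypothetical categories for illustration only; the abstract does not
    # list Memanto's thirteen predefined memory types.
    FACT = "fact"
    PREFERENCE = "preference"
    GOAL = "goal"


@dataclass
class MemoryRecord:
    mem_type: MemoryType
    subject: str                         # what the memory is about, e.g. "user.city"
    content: str
    version: int                         # 1-based position in the version chain
    created_at: int                      # logical timestamp supplied by the caller
    superseded_at: Optional[int] = None  # set when a newer version replaces this one

    @property
    def active(self) -> bool:
        return self.superseded_at is None


class TypedMemoryStore:
    """Toy store: one version chain per (type, subject) pair."""

    def __init__(self) -> None:
        # Full history per key, oldest first; nothing is ever deleted.
        self._history: dict[tuple[MemoryType, str], list[MemoryRecord]] = {}

    def write(self, mem_type: MemoryType, subject: str,
              content: str, now: int) -> MemoryRecord:
        chain = self._history.setdefault((mem_type, subject), [])
        if chain and chain[-1].content == content:
            return chain[-1]              # identical restatement: no new version
        if chain:
            chain[-1].superseded_at = now  # assumed conflict rule: newest wins
        record = MemoryRecord(mem_type, subject, content, len(chain) + 1, now)
        chain.append(record)
        return record

    def current(self, mem_type: MemoryType, subject: str) -> Optional[MemoryRecord]:
        chain = self._history.get((mem_type, subject), [])
        return chain[-1] if chain else None

    def as_of(self, mem_type: MemoryType, subject: str,
              when: int) -> Optional[MemoryRecord]:
        # Walk newest to oldest and return the version that was live at `when`.
        for record in reversed(self._history.get((mem_type, subject), [])):
            if record.created_at <= when:
                return record
        return None
```

For example, writing `"Toronto"` at time 1 and `"Berlin"` at time 5 under `(MemoryType.FACT, "user.city")` leaves `"Berlin"` as the current value at version 2, while `as_of(..., when=3)` still returns the `"Toronto"` record. Keeping superseded records rather than overwriting them is what allows this kind of point-in-time lookup.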