MIRIX: Multi-Agent Memory System for LLM-Based Agents

July 10, 2025
Authors: Yu Wang, Xi Chen
cs.AI

Abstract

Although the memory capabilities of AI agents are gaining increasing attention, existing solutions remain fundamentally limited. Most rely on flat, narrowly scoped memory components, constraining their ability to personalize, abstract, and reliably recall user-specific information over time. To this end, we introduce MIRIX, a modular, multi-agent memory system that redefines the future of AI memory by solving the field's most critical challenge: enabling language models to truly remember. Unlike prior approaches, MIRIX transcends text to embrace rich visual and multimodal experiences, making memory genuinely useful in real-world scenarios. MIRIX consists of six distinct, carefully structured memory types: Core, Episodic, Semantic, Procedural, Resource Memory, and Knowledge Vault, coupled with a multi-agent framework that dynamically controls and coordinates updates and retrieval. This design enables agents to persist, reason over, and accurately retrieve diverse, long-term user data at scale. We validate MIRIX in two demanding settings. The first is ScreenshotVQA, a challenging multimodal benchmark comprising nearly 20,000 high-resolution computer screenshots per sequence; it requires deep contextual understanding, and no existing memory system can be applied to it. On this benchmark, MIRIX achieves 35% higher accuracy than the RAG baseline while reducing storage requirements by 99.9%. The second is LOCOMO, a long-form conversation benchmark with single-modal textual input, on which MIRIX attains state-of-the-art performance of 85.4%, far surpassing existing baselines. These results show that MIRIX sets a new performance standard for memory-augmented LLM agents. To allow users to experience our memory system, we provide a packaged application powered by MIRIX. It monitors the screen in real time, builds a personalized memory base, and offers intuitive visualization and secure local storage to ensure privacy.
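
To make the architecture in the abstract concrete, the following is a minimal sketch of six typed memory stores behind a single coordinator that routes updates and fans out retrieval. It is not MIRIX's actual implementation or API: the class names (`MemoryStore`, `Coordinator`), the keyword-matching retrieval, and the assignment of the example entries to particular memory types are all assumptions made for illustration. Only the six memory type names come from the abstract.

```python
from dataclasses import dataclass, field
from enum import Enum


class MemoryType(Enum):
    # The six memory types named in the abstract.
    CORE = "core"
    EPISODIC = "episodic"
    SEMANTIC = "semantic"
    PROCEDURAL = "procedural"
    RESOURCE = "resource"
    KNOWLEDGE_VAULT = "knowledge_vault"


@dataclass
class MemoryStore:
    """Hypothetical store for one memory type; holds plain text entries."""
    memory_type: MemoryType
    entries: list[str] = field(default_factory=list)

    def update(self, text: str) -> None:
        self.entries.append(text)

    def retrieve(self, query: str) -> list[str]:
        # Naive keyword match; a stand-in for whatever retrieval a real system uses.
        words = query.lower().split()
        return [e for e in self.entries if any(w in e.lower() for w in words)]


class Coordinator:
    """Hypothetical coordinator: routes updates and queries every store."""

    def __init__(self) -> None:
        self.stores = {t: MemoryStore(t) for t in MemoryType}

    def update(self, memory_type: MemoryType, text: str) -> None:
        self.stores[memory_type].update(text)

    def retrieve(self, query: str) -> dict[MemoryType, list[str]]:
        # Query all stores and keep only the types that returned something.
        hits = {t: s.retrieve(query) for t, s in self.stores.items()}
        return {t: h for t, h in hits.items() if h}


coordinator = Coordinator()
# Which type each entry belongs to is an illustrative guess, not taken from the paper.
coordinator.update(MemoryType.CORE, "User prefers concise answers")
coordinator.update(MemoryType.EPISODIC, "Edited the budget spreadsheet on Monday")
coordinator.update(MemoryType.PROCEDURAL, "To export the budget: File, then Download")
results = coordinator.retrieve("budget")
for memory_type, entries in results.items():
    print(memory_type.value, entries)
```

In this sketch a single coordinator object plays the role that the abstract assigns to a multi-agent framework; in a fuller design each store would presumably be managed by its own agent, with the coordinator deciding which agents to invoke for a given update or query.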