

Omni-SimpleMem: Autoresearch-Guided Discovery of Lifelong Multimodal Agent Memory

April 2, 2026
Authors: Jiaqi Liu, Zipeng Ling, Shi Qiu, Yanqing Liu, Siwei Han, Peng Xia, Haoqin Tu, Zeyu Zheng, Cihang Xie, Charles Fleming, Mingyu Ding, Huaxiu Yao
cs.AI

Abstract

AI agents increasingly operate over extended time horizons, yet their ability to retain, organize, and recall multimodal experiences remains a critical bottleneck. Building effective lifelong memory requires navigating a vast design space spanning architecture, retrieval strategies, prompt engineering, and data pipelines; this space is too large and interconnected for manual exploration or traditional AutoML to explore effectively. We deploy an autonomous research pipeline to discover Omni-SimpleMem, a unified multimodal memory framework for lifelong AI agents. Starting from a naïve baseline (F1=0.117 on LoCoMo), the pipeline autonomously executes ~50 experiments across two benchmarks, diagnosing failure modes, proposing architectural modifications, and repairing data pipeline bugs, all without human intervention in the inner loop. The resulting system achieves state-of-the-art on both benchmarks, improving F1 by +411% on LoCoMo (0.117 → 0.598) and +214% on Mem-Gallery (0.254 → 0.797) relative to the initial configurations. Critically, the most impactful discoveries are not hyperparameter adjustments: bug fixes (+175%), architectural changes (+44%), and prompt engineering (+188% on specific categories) each individually exceed the cumulative contribution of all hyperparameter tuning, demonstrating capabilities fundamentally beyond the reach of traditional AutoML. We provide a taxonomy of six discovery types and identify four properties that make multimodal memory particularly suited for autoresearch, offering guidance for applying autonomous research pipelines to other AI system domains. Code is available at https://github.com/aiming-lab/SimpleMem.