Omni-SimpleMem: Autoresearch-Guided Discovery of Lifelong Multimodal Agent Memory

Abstract

AI agents increasingly operate over extended time horizons, yet their ability to retain, organize, and recall multimodal experiences remains a critical bottleneck. Building effective lifelong memory requires navigating a vast design space spanning architecture, retrieval strategies, prompt engineering, and data pipelines; this space is too large and too interconnected for manual search or traditional AutoML to cover effectively. We deploy an autonomous research pipeline to discover Omni-SimpleMem, a unified multimodal memory framework for lifelong AI agents. Starting from a naïve baseline (F1 = 0.117 on LoCoMo), the pipeline autonomously executes approximately 50 experiments across two benchmarks, diagnosing failure modes, proposing architectural modifications, and repairing data pipeline bugs, all without human intervention in the inner loop. The resulting system achieves state-of-the-art results on both benchmarks, improving F1 by +411% on LoCoMo (0.117 → 0.598) and by +214% on Mem-Gallery (0.254 → 0.797) relative to the initial configurations. Critically, the most impactful discoveries are not hyperparameter adjustments: bug fixes (+175%), architectural changes (+44%), and prompt engineering (+188% on specific categories) each individually exceed the cumulative contribution of all hyperparameter tuning, demonstrating capabilities fundamentally beyond the reach of traditional AutoML. We provide a taxonomy of six discovery types and identify four properties that make multimodal memory particularly well suited to autoresearch, offering guidance for applying autonomous research pipelines to other AI system domains. Code is available at https://github.com/aiming-lab/SimpleMem.
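
The relative F1 gains quoted in the abstract can be checked directly from the reported baseline and final scores. The following is a minimal sketch of that arithmetic only; the helper name `relative_improvement` and the `reported` dictionary are illustrative assumptions and are not taken from the SimpleMem codebase.

```python
def relative_improvement(baseline: float, final: float) -> float:
    """Return the relative change from baseline to final, in percent."""
    return (final - baseline) / baseline * 100.0


# F1 scores as quoted in the abstract: (initial configuration, final system).
# The dictionary itself is a hypothetical container used only for this check.
reported = {
    "LoCoMo": (0.117, 0.598),
    "Mem-Gallery": (0.254, 0.797),
}

for name, (baseline, final) in reported.items():
    # Prints a signed, rounded percentage for each benchmark.
    print(f"{name}: {relative_improvement(baseline, final):+.0f}%")
```

Rounded to the nearest percent, this gives +411% for LoCoMo and +214% for Mem-Gallery, matching the figures stated in the abstract.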

