SimpleMem: Efficient Lifelong Memory for LLM Agents

Abstract

To support reliable long-term interaction in complex environments, LLM agents require memory systems that efficiently manage historical experiences. Existing approaches either retain full interaction histories via passive context extension, leading to substantial redundancy, or rely on iterative reasoning to filter noise, incurring high token costs. To address this challenge, we introduce SimpleMem, an efficient memory framework based on semantic lossless compression. We propose a three-stage pipeline designed to maximize information density and token utilization: (1) Semantic Structured Compression, which applies entropy-aware filtering to distill unstructured interactions into compact, multi-view indexed memory units; (2) Recursive Memory Consolidation, an asynchronous process that integrates related units into higher-level abstract representations to reduce redundancy; and (3) Adaptive Query-Aware Retrieval, which dynamically adjusts retrieval scope based on query complexity to construct precise context efficiently. Experiments on benchmark datasets show that our method consistently outperforms baseline approaches in accuracy, retrieval efficiency, and inference cost, achieving an average F1 improvement of 26.4% while reducing inference-time token consumption by up to 30-fold. This demonstrates a superior balance between performance and efficiency. Code is available at https://github.com/aiming-lab/SimpleMem.
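To make the three stages concrete, the following is a minimal toy sketch and not the SimpleMem implementation (for that, see the repository linked above). Every name in it (MemoryUnit, compress, consolidate, retrieve, the thresholds) is hypothetical, and each stage uses a deliberately simple stand-in that we assume only for illustration: word-level Shannon entropy for the entropy-aware filter, Jaccard keyword overlap for consolidation, and a query-term count for the adaptive retrieval scope. Consolidation runs synchronously here, whereas the abstract describes it as asynchronous.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass

# Hypothetical stopword list, just large enough for the example below.
STOPWORDS = {"a", "an", "the", "to", "in", "on", "of", "and", "i",
             "is", "my", "did", "do", "what", "when", "where"}


def tokens(text):
    """Lowercase alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def entropy_bits(text):
    """Shannon entropy (in bits) of the word distribution of one turn."""
    counts = Counter(tokens(text))
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


@dataclass
class MemoryUnit:
    text: str
    keywords: frozenset


def compress(turns, min_bits=2.0):
    """Stage 1 stand-in: drop low-entropy turns, index the rest by keywords."""
    units = []
    for turn in turns:
        if entropy_bits(turn) >= min_bits:
            kws = frozenset(t for t in tokens(turn) if t not in STOPWORDS)
            units.append(MemoryUnit(turn, kws))
    return units


def consolidate(units, min_overlap=0.3):
    """Stage 2 stand-in: greedily merge units whose keyword sets overlap."""
    merged = []
    for unit in units:
        for i, existing in enumerate(merged):
            union = unit.keywords | existing.keywords
            shared = unit.keywords & existing.keywords
            jaccard = len(shared) / len(union) if union else 0.0
            if jaccard >= min_overlap:
                merged[i] = MemoryUnit(existing.text + " " + unit.text, union)
                break
        else:
            merged.append(unit)
    return merged


def retrieve(units, query, max_k=5):
    """Stage 3 stand-in: more content terms in the query widen the scope k."""
    terms = {t for t in tokens(query) if t not in STOPWORDS}
    k = min(max_k, max(1, len(terms) // 2))
    ranked = sorted(units, key=lambda u: len(u.keywords & terms), reverse=True)
    return [u for u in ranked[:k] if u.keywords & terms]


turns = [
    "ok",
    "I moved to Berlin in March",
    "thanks thanks",
    "The Berlin move happened in March",
    "The dentist appointment is on Friday morning",
]
memory = consolidate(compress(turns))
for unit in retrieve(memory, "When is the dentist appointment?"):
    print(unit.text)
```

In this sketch the filler turns are dropped before storage, the two Berlin turns are merged into a single unit, and a short query retrieves fewer units than a query that names several topics. These are the same trade-offs the pipeline targets, here in a much cruder form.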
