

MemFly: On-the-Fly Memory Optimization via Information Bottleneck

February 8, 2026
作者: Zhenyuan Zhang, Xianzhang Jia, Zhiqin Yang, Zhenbo Song, Wei Xue, Sirui Han, Yike Guo
cs.AI

Abstract

Long-term memory enables large language model agents to tackle complex tasks through historical interactions. However, existing frameworks encounter a fundamental dilemma between compressing redundant information efficiently and maintaining precise retrieval for downstream tasks. To bridge this gap, we propose MemFly, a framework grounded in information bottleneck principles that facilitates on-the-fly memory evolution for LLMs. Our approach minimizes compression entropy while maximizing relevance entropy via a gradient-free optimizer, constructing a stratified memory structure for efficient storage. To fully leverage MemFly, we develop a hybrid retrieval mechanism that seamlessly integrates semantic, symbolic, and topological pathways, incorporating iterative refinement to handle complex multi-hop queries. Comprehensive experiments demonstrate that MemFly substantially outperforms state-of-the-art baselines in memory coherence, response fidelity, and accuracy.
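The compression–retrieval trade-off described above follows the classical information bottleneck formulation. As a reference point (the abstract does not state MemFly's exact objective, so this is the standard formulation, not the paper's), the IB Lagrangian over raw interactions \(X\), compressed memory \(M\), and the downstream task signal \(Y\) is:

```latex
\min_{p(m \mid x)} \; \mathcal{L}_{\mathrm{IB}}
  \;=\; I(X; M) \;-\; \beta \, I(M; Y), \qquad \beta > 0
```

The first term penalizes information retained about the raw interaction history (compression), while the second rewards information preserved about downstream answers (relevance); \(\beta\) balances the two. The abstract's "minimizing compression entropy while maximizing relevance entropy" corresponds to these two mutual-information terms.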
March 17, 2026