Learning Query-Aware Budget-Tier Routing for Runtime Agent Memory

February 5, 2026
Authors: Haozhen Zhang, Haodong Yue, Tao Feng, Quanyu Long, Jianzhu Bao, Bowen Jin, Weizhi Zhang, Xiao Li, Jiaxuan You, Chengwei Qin, Wenya Wang
cs.AI

Abstract

Memory is increasingly central to Large Language Model (LLM) agents operating beyond a single context window, yet most existing systems rely on offline, query-agnostic memory construction that can be inefficient and may discard query-critical information. Although runtime memory utilization is a natural alternative, prior work often incurs substantial overhead and offers limited explicit control over the performance-cost trade-off. In this work, we present BudgetMem, a runtime agent memory framework for explicit, query-aware performance-cost control. BudgetMem structures memory processing as a set of memory modules, each offered in three budget tiers (i.e., Low/Mid/High). A lightweight router, implemented as a compact neural policy trained with reinforcement learning, performs budget-tier routing across modules to balance task performance and memory construction cost. Using BudgetMem as a unified testbed, we study three complementary strategies for realizing budget tiers: implementation (method complexity), reasoning (inference behavior), and capacity (module model size). Across LoCoMo, LongMemEval, and HotpotQA, BudgetMem surpasses strong baselines when performance is prioritized (i.e., high-budget setting), and delivers better accuracy-cost frontiers under tighter budgets. Moreover, our analysis disentangles the strengths and weaknesses of different tiering strategies, clarifying when each axis delivers the most favorable trade-offs under varying budget regimes.
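
The budget-tier routing idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the module names (`retrieve`, `summarize`), the per-tier costs in `TIER_COST`, the `cost_weight` parameter, and the hand-written scores are all hypothetical, and a simple greedy rule stands in for the reinforcement-learning-trained neural policy. It only shows how a per-module choice among Low/Mid/High tiers can trade estimated benefit against cost.

```python
from dataclasses import dataclass
from typing import Dict

TIERS = ("low", "mid", "high")

# Hypothetical per-tier costs in arbitrary units; the source gives no numbers.
TIER_COST = {"low": 1.0, "mid": 3.0, "high": 9.0}


@dataclass
class RoutingDecision:
    tiers: Dict[str, str]  # module name -> chosen tier
    cost: float            # total cost of the chosen tiers


def route(scores: Dict[str, Dict[str, float]], cost_weight: float) -> RoutingDecision:
    """For each module, pick the tier maximizing score - cost_weight * cost.

    `scores` stands in for the output of a learned, query-aware policy
    (assumption); here it is simply supplied by the caller.
    """
    chosen = {}
    for module, tier_scores in scores.items():
        chosen[module] = max(
            TIERS, key=lambda t: tier_scores[t] - cost_weight * TIER_COST[t]
        )
    total = sum(TIER_COST[t] for t in chosen.values())
    return RoutingDecision(tiers=chosen, cost=total)


# Hypothetical estimated benefit of each tier for one query.
example_scores = {
    "retrieve": {"low": 0.2, "mid": 0.5, "high": 0.9},
    "summarize": {"low": 0.4, "mid": 0.5, "high": 0.55},
}

performance_first = route(example_scores, cost_weight=0.0)
budget_aware = route(example_scores, cost_weight=0.1)
print(performance_first)
print(budget_aware)
```

Raising `cost_weight` pushes the choices toward cheaper tiers, which loosely mirrors moving from the high-budget setting to tighter budgets; in the framework described above, that trade-off is learned by the router rather than set by a fixed rule.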