Learning Query-Aware Budget-Tier Routing for Runtime Agent Memory

February 5, 2026
Authors: Haozhen Zhang, Haodong Yue, Tao Feng, Quanyu Long, Jianzhu Bao, Bowen Jin, Weizhi Zhang, Xiao Li, Jiaxuan You, Chengwei Qin, Wenya Wang
cs.AI

Abstract

Memory is increasingly central to Large Language Model (LLM) agents operating beyond a single context window, yet most existing systems rely on offline, query-agnostic memory construction that can be inefficient and may discard query-critical information. Although runtime memory utilization is a natural alternative, prior work often incurs substantial overhead and offers limited explicit control over the performance-cost trade-off. In this work, we present BudgetMem, a runtime agent memory framework for explicit, query-aware performance-cost control. BudgetMem structures memory processing as a set of memory modules, each offered in three budget tiers (i.e., Low/Mid/High). A lightweight router, implemented as a compact neural policy trained with reinforcement learning, performs budget-tier routing across modules to balance task performance and memory construction cost. Using BudgetMem as a unified testbed, we study three complementary strategies for realizing budget tiers: implementation (method complexity), reasoning (inference behavior), and capacity (module model size). Across LoCoMo, LongMemEval, and HotpotQA, BudgetMem surpasses strong baselines when performance is prioritized (i.e., high-budget setting), and delivers better accuracy-cost frontiers under tighter budgets. Moreover, our analysis disentangles the strengths and weaknesses of different tiering strategies, clarifying when each axis delivers the most favorable trade-offs under varying budget regimes.
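To make the idea of budget-tier routing concrete, the following is a minimal sketch and not the authors' implementation. It assumes a softmax policy over the three tiers for each module and a reward of the form task score minus a weighted cost. The module names (`filter`, `extract`, `summarize`), the per-tier costs, the two-element feature vector, the hand-set weights, and the `cost_weight` value are all hypothetical; the abstract specifies none of them, and in the paper the policy is trained with reinforcement learning rather than set by hand.

```python
import math
import random
from dataclasses import dataclass

TIERS = ("low", "mid", "high")
# Hypothetical relative costs per tier; the abstract gives no numbers.
TIER_COST = {"low": 1.0, "mid": 3.0, "high": 9.0}


@dataclass
class TierRouter:
    """Toy linear-softmax policy: one weight vector per (module, tier)."""
    modules: tuple   # hypothetical module names
    weights: dict    # weights[module][tier] -> list of floats, one per feature

    def tier_probs(self, module, features):
        # Linear logits followed by a numerically stable softmax.
        logits = [
            sum(w * x for w, x in zip(self.weights[module][t], features))
            for t in TIERS
        ]
        top = max(logits)
        exps = [math.exp(v - top) for v in logits]
        total = sum(exps)
        return [e / total for e in exps]

    def route(self, features, rng=None):
        """Pick a tier per module: greedy if rng is None, else sampled."""
        plan = {}
        for module in self.modules:
            probs = self.tier_probs(module, features)
            if rng is None:
                idx = max(range(len(TIERS)), key=probs.__getitem__)
            else:
                idx = rng.choices(range(len(TIERS)), weights=probs)[0]
            plan[module] = TIERS[idx]
        return plan


def plan_cost(plan):
    """Total memory-construction cost of a routing plan."""
    return sum(TIER_COST[tier] for tier in plan.values())


def reward(task_score, plan, cost_weight=0.05):
    """Assumed scalar reward: task performance minus weighted cost."""
    return task_score - cost_weight * plan_cost(plan)


# Hand-set weights for illustration only; features = [bias, query difficulty].
MODULES = ("filter", "extract", "summarize")
SHARED = {"low": [1.0, 0.0], "mid": [0.0, 1.0], "high": [-1.0, 3.0]}
router = TierRouter(MODULES, {m: SHARED for m in MODULES})

easy_plan = router.route([1.0, 0.0])   # low-difficulty query
hard_plan = router.route([1.0, 1.0])   # high-difficulty query
sampled_plan = router.route([1.0, 0.5], rng=random.Random(0))
print(easy_plan, plan_cost(easy_plan))
print(hard_plan, plan_cost(hard_plan))
```

In this sketch, raising `cost_weight` penalizes expensive plans more heavily, which is one simple way to expose an explicit performance-cost knob; whether BudgetMem uses this exact reward shape is an assumption here.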