Sculptor: Empowering LLMs with Cognitive Agency via Active Context Management

August 6, 2025
Authors: Mo Li, L. H. Xu, Qitai Tan, Ting Cao, Yunxin Liu
cs.AI

Abstract

Large Language Models (LLMs) suffer from significant performance degradation when processing long contexts due to proactive interference, where irrelevant information in earlier parts of the context disrupts reasoning and memory recall. While most research focuses on external memory systems to augment LLMs' capabilities, we propose a complementary approach: empowering LLMs with Active Context Management (ACM) tools to actively sculpt their internal working memory. We introduce Sculptor, a framework that equips LLMs with three categories of tools: (1) context fragmentation; (2) summary, hide, and restore; and (3) intelligent search. Our approach enables LLMs to proactively manage their attention and working memory, analogous to how humans selectively focus on relevant information while filtering out distractions. Experimental evaluation on the information-sparse benchmarks PI-LLM (proactive interference) and NeedleBench Multi-Needle Reasoning demonstrates that Sculptor significantly improves performance even without specific training, leveraging LLMs' inherent tool-calling generalization capabilities. By enabling Active Context Management, Sculptor not only mitigates proactive interference but also provides a cognitive foundation for more reliable reasoning across diverse long-context tasks, highlighting that explicit context-control strategies, rather than merely larger token windows, are key to robustness at scale.
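
To make the three tool categories concrete, here is a minimal sketch of what an ACM tool layer could look like. Everything below is an illustrative assumption, not Sculptor's actual implementation or API: the `Fragment` and `WorkingContext` names, the paragraph-based splitting, and the keyword-match search (standing in for whatever "intelligent search" the paper uses) are all hypothetical.

```python
# Hypothetical sketch of Active Context Management (ACM) tools mirroring the
# abstract's three categories, over a simple list-of-fragments working memory.
# Names and behavior are illustrative, not the paper's API.

from dataclasses import dataclass, field


@dataclass
class Fragment:
    fid: int
    text: str
    summary: str | None = None
    hidden: bool = False


@dataclass
class WorkingContext:
    fragments: list[Fragment] = field(default_factory=list)

    # (1) Context fragmentation: split raw context into addressable chunks.
    def fragment(self, raw: str, sep: str = "\n\n") -> list[int]:
        for chunk in filter(None, (c.strip() for c in raw.split(sep))):
            self.fragments.append(Fragment(fid=len(self.fragments), text=chunk))
        return [f.fid for f in self.fragments]

    # (2) Summary, hide, and restore: replace a fragment with a short summary
    # in the active view, or bring the full text back verbatim.
    def hide(self, fid: int, summary: str) -> None:
        frag = self.fragments[fid]
        frag.summary, frag.hidden = summary, True

    def restore(self, fid: int) -> None:
        self.fragments[fid].hidden = False

    # (3) Intelligent search: look inside hidden fragments without restoring
    # them wholesale (keyword match as a stand-in for a smarter retriever).
    def search(self, query: str) -> list[int]:
        return [f.fid for f in self.fragments
                if f.hidden and query.lower() in f.text.lower()]

    # Render what the model actually "sees": visible text plus summaries.
    def render(self) -> str:
        return "\n\n".join(
            f"[{f.fid}] (hidden: {f.summary})" if f.hidden else f"[{f.fid}] {f.text}"
            for f in self.fragments)
```

In this reading, a host agent would expose `fragment`, `hide`, `restore`, and `search` as callable tools, so the model itself decides which parts of its context to compress, drop from view, or retrieve, rather than delegating those decisions to a fixed external memory system.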