Sculptor: Empowering LLMs with Cognitive Agency via Active Context Management
August 6, 2025
Authors: Mo Li, L. H. Xu, Qitai Tan, Ting Cao, Yunxin Liu
cs.AI
Abstract
Large Language Models (LLMs) suffer significant performance degradation when processing long contexts due to proactive interference, where irrelevant information in earlier parts of the context disrupts reasoning and memory recall. While most research focuses on external memory systems to augment LLMs' capabilities, we propose a complementary approach: empowering LLMs with Active Context Management (ACM) tools to actively sculpt their internal working memory. We introduce Sculptor, a framework that equips LLMs with three categories of tools: (1) context fragmentation; (2) summarization, hiding, and restoration; and (3) intelligent search. Our approach enables LLMs to proactively manage their attention and working memory, analogous to how humans selectively focus on relevant information while filtering out distractions. Experimental evaluation on two information-sparse benchmarks, PI-LLM (proactive interference) and NeedleBench Multi-Needle Reasoning, demonstrates that Sculptor significantly improves performance even without specific training, leveraging LLMs' inherent tool-calling generalization capabilities. By enabling Active Context Management, Sculptor not only mitigates proactive interference but also provides a cognitive foundation for more reliable reasoning across diverse long-context tasks, highlighting that explicit context-control strategies, rather than merely larger token windows, are key to robustness at scale.
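The abstract names the three tool categories but does not specify an interface. As a rough, hypothetical sketch of how such Active Context Management tools might be exposed to a tool-calling model, consider the Python outline below; all names here (ContextManager, Fragment, fragment, summarize_and_hide, restore, search, render) are illustrative assumptions, not the paper's actual API.

```python
# Hypothetical sketch of the three Sculptor tool categories as a tool-calling
# interface. Class and method names are illustrative assumptions only.
from dataclasses import dataclass, field


@dataclass
class Fragment:
    """A contiguous, addressable slice of the conversation context."""
    fragment_id: int
    text: str
    hidden: bool = False          # hidden fragments are excluded from the prompt
    summary: str | None = None    # short stand-in shown while the text is hidden


@dataclass
class ContextManager:
    """Working-memory store that the LLM manipulates through tool calls."""
    fragments: list[Fragment] = field(default_factory=list)

    # (1) context fragmentation: split raw context into addressable pieces
    def fragment(self, text: str, delimiter: str = "\n\n") -> list[int]:
        ids = []
        for chunk in filter(None, text.split(delimiter)):
            fid = len(self.fragments)
            self.fragments.append(Fragment(fid, chunk))
            ids.append(fid)
        return ids

    # (2) summarization, hiding, and restoration: collapse a fragment to a
    # short summary, and bring the full text back on demand
    def summarize_and_hide(self, fragment_id: int, summary: str) -> None:
        frag = self.fragments[fragment_id]
        frag.summary, frag.hidden = summary, True

    def restore(self, fragment_id: int) -> str:
        frag = self.fragments[fragment_id]
        frag.hidden = False
        return frag.text

    # (3) intelligent search: retrieve fragments (hidden or visible) by keyword
    def search(self, query: str) -> list[int]:
        return [f.fragment_id for f in self.fragments
                if query.lower() in f.text.lower()]

    # Render the working context: hidden fragments appear only as summaries
    def render(self) -> str:
        return "\n\n".join(
            f"[hidden #{f.fragment_id}] {f.summary}" if f.hidden else f.text
            for f in self.fragments
        )
```

In a design along these lines, hidden fragments remain recoverable through restore or search, so aggressive summarization reduces interference in the visible context without permanently discarding information.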