Sculptor: Empowering LLMs with Cognitive Agency via Active Context Management
August 6, 2025
Authors: Mo Li, L. H. Xu, Qitai Tan, Ting Cao, Yunxin Liu
cs.AI
Abstract
Large Language Models (LLMs) suffer significant performance degradation when processing long contexts due to proactive interference, where irrelevant information in earlier parts of the context disrupts reasoning and memory recall. While most research focuses on external memory systems to augment LLMs' capabilities, we propose a complementary approach: empowering LLMs with Active Context Management (ACM) tools to actively sculpt their internal working memory. We introduce Sculptor, a framework that equips LLMs with three categories of tools: (1) context fragmentation; (2) summarization, hiding, and restoration; and (3) intelligent search. Our approach enables LLMs to proactively manage their attention and working memory, analogous to how humans selectively focus on relevant information while filtering out distractions. Experimental evaluation on two information-sparse benchmarks, PI-LLM (proactive interference) and NeedleBench Multi-Needle Reasoning, demonstrates that Sculptor significantly improves performance even without specific training, leveraging LLMs' inherent tool-calling generalization capabilities. By enabling Active Context Management, Sculptor not only mitigates proactive interference but also provides a cognitive foundation for more reliable reasoning across diverse long-context tasks, highlighting that explicit context-control strategies, rather than merely larger token windows, are key to robustness at scale.
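The abstract names the three tool categories but does not specify an interface. As a rough, hypothetical sketch of how such Active Context Management tools might be exposed to a tool-calling model, consider the Python outline below; all names here (ContextManager, Fragment, fragment, summarize_and_hide, restore, search, render) are illustrative assumptions, not the paper's actual API.

```python
# Hypothetical sketch of the three Sculptor tool categories as a tool-calling
# interface. Class and method names are illustrative assumptions only.
from dataclasses import dataclass, field


@dataclass
class Fragment:
    """A contiguous, addressable slice of the conversation context."""
    fragment_id: int
    text: str
    hidden: bool = False          # hidden fragments are excluded from the prompt
    summary: str | None = None    # short stand-in shown while the text is hidden


@dataclass
class ContextManager:
    """Working-memory store that the LLM manipulates through tool calls."""
    fragments: list[Fragment] = field(default_factory=list)

    # (1) context fragmentation: split raw context into addressable pieces
    def fragment(self, text: str, delimiter: str = "\n\n") -> list[int]:
        ids = []
        for chunk in filter(None, text.split(delimiter)):
            fid = len(self.fragments)
            self.fragments.append(Fragment(fid, chunk))
            ids.append(fid)
        return ids

    # (2) summarization, hiding, and restoration: collapse a fragment to a
    # short summary, and bring the full text back on demand
    def summarize_and_hide(self, fragment_id: int, summary: str) -> None:
        frag = self.fragments[fragment_id]
        frag.summary, frag.hidden = summary, True

    def restore(self, fragment_id: int) -> str:
        frag = self.fragments[fragment_id]
        frag.hidden = False
        return frag.text

    # (3) intelligent search: retrieve fragments (hidden or visible) by keyword
    def search(self, query: str) -> list[int]:
        return [f.fragment_id for f in self.fragments
                if query.lower() in f.text.lower()]

    # Render the working context: hidden fragments appear only as summaries
    def render(self) -> str:
        return "\n\n".join(
            f"[hidden #{f.fragment_id}] {f.summary}" if f.hidden else f.text
            for f in self.fragments
        )
```

In a design along these lines, hidden fragments remain recoverable through restore or search, so aggressive summarization reduces interference in the visible context without permanently discarding information.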