

Scaling Long-Horizon LLM Agent via Context-Folding

October 13, 2025
Authors: Weiwei Sun, Miao Lu, Zhan Ling, Kang Liu, Xuesong Yao, Yiming Yang, Jiecao Chen
cs.AI

Abstract

Large language model (LLM) agents are fundamentally constrained by context length on long-horizon tasks. We introduce Context-Folding, a framework that empowers agents to actively manage their working context. An agent can procedurally branch into a sub-trajectory to handle a subtask and then fold it upon completion, collapsing the intermediate steps while retaining a concise summary of the outcome. To make this behavior learnable, we develop FoldGRPO, an end-to-end reinforcement learning framework with specific process rewards that encourage effective task decomposition and context management. On complex long-horizon tasks (Deep Research and SWE), our folding agent matches or outperforms the ReAct baselines while using an active context 10× smaller, and it significantly outperforms models that rely on summarization-based context management.
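The branch/fold mechanism can be pictured as a stack over the working context: a branch records where a sub-trajectory begins, and the matching fold replaces everything from that point onward with a one-line summary, so the active context stays short while completed subtasks leave only their outcomes behind. The sketch below is a minimal illustration of that idea under stated assumptions, not the paper's implementation; all names here (ContextFolder, branch, fold) are hypothetical.

```python
from dataclasses import dataclass, field

# Minimal sketch of the branch/fold idea described in the abstract.
# All names (ContextFolder, branch, fold) are illustrative assumptions,
# not the paper's API.

@dataclass
class ContextFolder:
    """Keeps the active context short by folding finished sub-trajectories."""
    active: list[str] = field(default_factory=list)   # visible working context
    _starts: list[int] = field(default_factory=list)  # offsets where branches began

    def add(self, step: str) -> None:
        # Append an ordinary agent step (thought, tool call, observation).
        self.active.append(step)

    def branch(self, subtask: str) -> None:
        # Record where this sub-trajectory begins so it can be folded later.
        self._starts.append(len(self.active))
        self.active.append(f"[branch] {subtask}")

    def fold(self, summary: str) -> None:
        # Collapse every step since the matching branch into one summary line.
        start = self._starts.pop()
        self.active[start:] = [f"[folded] {summary}"]

folder = ContextFolder()
folder.add("plan: fix failing test in repo")
folder.branch("locate the failing test")
folder.add("ran pytest; saw failure in test_parser.py")
folder.add("read test_parser.py")
folder.fold("failure cause: parser drops trailing newline")
print(folder.active)
# ['plan: fix failing test in repo',
#  '[folded] failure cause: parser drops trailing newline']
```

In this toy version, the intermediate tool calls vanish from the active context after fold, which is what would let an agent's visible context stay roughly an order of magnitude smaller than the full trajectory; the paper's FoldGRPO additionally trains when to branch and fold via process rewards, which this sketch does not model.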