

SWE-Pruner: Self-Adaptive Context Pruning for Coding Agents

January 23, 2026
Authors: Yuhang Wang, Yuling Shi, Mo Yang, Rongrui Zhang, Shilin He, Heng Lian, Yuting Chen, Siyu Ye, Kai Cai, Xiaodong Gu
cs.AI

Abstract

LLM agents have demonstrated remarkable capabilities in software development, but their performance is hampered by long interaction contexts, which incur high API costs and latency. While various context compression approaches such as LongLLMLingua have emerged to tackle this challenge, they typically rely on fixed metrics such as perplexity (PPL) and ignore the task-specific nature of code understanding. As a result, they frequently disrupt syntactic and logical structure and fail to retain critical implementation details. In this paper, we propose SWE-Pruner, a self-adaptive context pruning framework tailored for coding agents. Drawing inspiration from how human programmers "selectively skim" source code during development and debugging, SWE-Pruner performs task-aware adaptive pruning of long contexts. Given the current task, the agent formulates an explicit goal (e.g., "focus on error handling") as a hint to guide pruning. A lightweight neural skimmer (0.6B parameters) is trained to dynamically select goal-relevant lines from the surrounding context. Evaluations across four benchmarks and multiple models validate SWE-Pruner's effectiveness in diverse scenarios: it achieves 23-54% token reduction on agent tasks such as SWE-Bench Verified and up to 14.84x compression on single-turn tasks such as LongCodeQA, with minimal performance impact.
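
To make the mechanism concrete, here is a minimal Python sketch of goal-conditioned line pruning. It is an illustration under stated assumptions, not the paper's implementation: `score_lines` is a hypothetical keyword-overlap stub standing in for the trained 0.6B neural skimmer, and `keep_ratio` is an illustrative budget rather than the method's actual compression target.

```python
# Minimal sketch of goal-conditioned context pruning in the spirit of
# SWE-Pruner. `score_lines` is a hypothetical stand-in for the paper's
# 0.6B neural skimmer: it uses naive keyword overlap so the example
# runs end to end, and keep_ratio is an illustrative budget only.
from typing import Callable, List


def score_lines(goal: str, lines: List[str]) -> List[float]:
    """Score each line's relevance to the goal (stub: token overlap)."""
    goal_tokens = set(goal.lower().split())
    return [
        len(goal_tokens & set(line.lower().split())) / (len(goal_tokens) or 1)
        for line in lines
    ]


def prune_context(
    goal: str,
    context: str,
    keep_ratio: float = 0.3,
    scorer: Callable[[str, List[str]], List[float]] = score_lines,
) -> str:
    """Keep the top-scoring lines in their original order; collapse each
    dropped span into a single ellipsis marker so structure stays readable."""
    lines = context.splitlines()
    scores = scorer(goal, lines)
    budget = max(1, int(len(lines) * keep_ratio))
    keep = set(
        sorted(range(len(lines)), key=scores.__getitem__, reverse=True)[:budget]
    )
    pruned, skipping = [], False
    for i, line in enumerate(lines):
        if i in keep:
            pruned.append(line)
            skipping = False
        elif not skipping:
            pruned.append("...")  # marker for a pruned span
            skipping = True
    return "\n".join(pruned)


if __name__ == "__main__":
    context = "\n".join([
        "def fetch(url):",
        "    resp = requests.get(url)",
        "    if resp.status_code != 200:",
        "        raise RuntimeError('error handling failed fetch')",
        "    return resp.json()",
        "",
        "def render(data):",
        "    return '\\n'.join(str(x) for x in data)",
    ])
    print(prune_context("focus on error handling", context, keep_ratio=0.4))
```

On this toy context, the goal "focus on error handling" retains the `raise` branch while collapsing the unrelated rendering code into ellipsis markers, mirroring how a programmer skims past code irrelevant to the task at hand.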