LongCodeZip: Compress Long Context for Code Language Models

Abstract

Code generation under long contexts is becoming increasingly critical as Large Language Models (LLMs) are required to reason over extensive information in a codebase. While recent advances enable code LLMs to process long inputs, high API costs and generation latency remain substantial bottlenecks. Existing context pruning techniques, such as LLMLingua, achieve promising results on general text but overlook code-specific structures and dependencies, leading to suboptimal performance on programming tasks. In this paper, we propose LongCodeZip, a novel plug-and-play code compression framework designed specifically for code LLMs. LongCodeZip employs a dual-stage strategy: (1) coarse-grained compression, which identifies and ranks function-level chunks using conditional perplexity with respect to the instruction and retains only the most relevant functions; and (2) fine-grained compression, which segments the retained functions into blocks based on perplexity and selects an optimal subset under an adaptive token budget to maximize relevance. Evaluations across multiple tasks, including code completion, summarization, and question answering, show that LongCodeZip consistently outperforms baseline methods, achieving up to a 5.6x compression ratio without degrading task performance. By effectively reducing context size while preserving essential information, LongCodeZip enables LLMs to scale better to real-world, large-scale code scenarios, advancing the efficiency and capability of code intelligence applications.
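The dual-stage idea can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not say how a relevance score is derived from conditional perplexity, so `relevance` is a hypothetical caller-supplied function. Reading "optimal subset under a token budget" as a 0/1 knapsack problem is also an assumption, and the budget here is fixed rather than adaptive. The names `perplexity`, `rank_functions`, and `select_blocks` are illustrative only.

```python
import math
from typing import Callable, List, Sequence, Tuple


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Perplexity from per-token natural-log probabilities: exp(-mean(log p))."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def rank_functions(
    functions: List[str],
    instruction: str,
    relevance: Callable[[str, str], float],
) -> List[str]:
    """Coarse stage (sketch): sort function-level chunks, most relevant first.

    `relevance` is a hypothetical placeholder for a score derived from
    conditional perplexity with respect to the instruction.
    """
    return sorted(functions, key=lambda f: relevance(f, instruction), reverse=True)


def select_blocks(blocks: List[Tuple[int, float]], budget: int) -> List[int]:
    """Fine stage (sketch): 0/1 knapsack over (token_count, relevance) pairs.

    Assumption: "optimal subset" means maximum total relevance within the
    token budget. Returns the indices of the chosen blocks in original order.
    """
    # best[b] holds (total relevance, chosen indices) using at most b tokens.
    best: List[Tuple[float, List[int]]] = [(0.0, [])] * (budget + 1)
    for i, (cost, value) in enumerate(blocks):
        # Iterate budgets downward so each block is used at most once.
        for b in range(budget, cost - 1, -1):
            candidate = best[b - cost][0] + value
            if candidate > best[b][0]:
                best[b] = (candidate, best[b - cost][1] + [i])
    return best[budget][1]
```

For example, `select_blocks([(5, 1.0), (4, 3.0), (3, 2.0), (6, 4.0)], 10)` picks the blocks at indices 1 and 3, which together use exactly 10 tokens and have the highest total relevance. In practice the per-token log-probabilities passed to `perplexity` would come from a language model, and the budget would vary per function.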
