ACON: Optimizing Context Compression for Long-horizon LLM Agents
October 1, 2025
Authors: Minki Kang, Wei-Ning Chen, Dongge Han, Huseyin A. Inan, Lukas Wutschitz, Yanzhi Chen, Robert Sim, Saravan Rajmohan
cs.AI
Abstract
Large language models (LLMs) are increasingly deployed as agents in dynamic,
real-world environments, where success requires both reasoning and effective
tool use. A central challenge for agentic tasks is the growing context length,
as agents must accumulate long histories of actions and observations. This
expansion raises costs and reduces efficiency in long-horizon tasks, yet prior
work on context compression has mostly focused on single-step tasks or narrow
applications. We introduce Agent Context Optimization (ACON), a unified
framework that optimally compresses both environment observations and
interaction histories into concise yet informative condensations. ACON
leverages compression guideline optimization in natural language space: given
paired trajectories where full context succeeds but compressed context fails,
capable LLMs analyze the causes of failure, and the compression guideline is
updated accordingly. Furthermore, we propose distilling the optimized LLM
compressor into smaller models to reduce the overhead of the additional module.
Experiments on AppWorld, OfficeBench, and Multi-objective QA show that ACON
reduces memory usage by 26-54% (peak tokens) while largely preserving task
performance, retains over 95% of accuracy when distilled into smaller
compressors, and strengthens smaller LMs as long-horizon agents, improving
their performance by up to 46%.
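
The abstract describes the guideline-optimization loop only at a high level, so the following is a minimal sketch of how such a contrastive loop could look. Everything beyond the abstract is an assumption for illustration: the `llm`, `run_agent`, and `compress` callables, the prompts, and result objects with `.success`/`.trajectory` fields are hypothetical stand-ins, not the paper's actual interface.

```python
# Illustrative sketch only: `llm`, `run_agent`, and `compress` are hypothetical
# callables, and the prompts paraphrase the idea stated in the abstract.

def optimize_guideline(tasks, guideline, llm, run_agent, compress, rounds=3):
    """Refine a natural-language compression guideline from paired failures."""
    for _ in range(rounds):
        contrastive_pairs = []
        for task in tasks:
            full = run_agent(task, compressor=None)  # uncompressed context
            short = run_agent(
                task, compressor=lambda ctx: compress(ctx, guideline)
            )
            # Keep trajectory pairs where the full context succeeds but the
            # compressed run fails: the compressor dropped something vital.
            if full.success and not short.success:
                contrastive_pairs.append((full.trajectory, short.trajectory))
        if not contrastive_pairs:
            break  # no compression-induced failures left to learn from
        # A capable LLM diagnoses, in natural language, what information
        # the compression lost in these paired trajectories ...
        analysis = llm(
            "The agent solved these tasks with full context but failed with "
            f"compressed context. Diagnose what was lost:\n{contrastive_pairs}"
        )
        # ... and the guideline is rewritten so future compressions keep it.
        guideline = llm(
            f"Current guideline:\n{guideline}\n\n"
            f"Failure analysis:\n{analysis}\n\n"
            "Rewrite the guideline to preserve the missing information."
        )
    return guideline
```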
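The distillation step can then be framed as ordinary supervised fine-tuning once training pairs exist. Below is a hedged sketch of the data-labeling half, assuming a hypothetical `teacher_llm` callable wrapping the optimized compressor; the resulting prompt/completion pairs would feed any standard SFT pipeline to train the smaller compressor.

```python
# Illustrative only: `teacher_llm` is a hypothetical callable wrapping the
# optimized LLM compressor; the dict format targets generic SFT tooling.

def build_distillation_data(raw_contexts, guideline, teacher_llm):
    """Label agent contexts with the teacher to train a small compressor."""
    pairs = []
    for ctx in raw_contexts:
        # The teacher compresses under the optimized guideline; the small
        # model later learns to reproduce these summaries from `ctx` alone.
        summary = teacher_llm(
            f"Compression guideline:\n{guideline}\n\n"
            f"Compress this context:\n{ctx}"
        )
        pairs.append({"prompt": ctx, "completion": summary})
    return pairs
```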