ARC-Encoder: learning compressed text representations for large language models

October 23, 2025
Authors: Hippolyte Pilchen, Edouard Grave, Patrick Pérez
cs.AI

Abstract

Recent techniques such as retrieval-augmented generation or chain-of-thought reasoning have led to longer contexts and increased inference costs. Context compression techniques can reduce these costs, but the most effective approaches require fine-tuning the target model or even modifying its architecture, which can degrade its general abilities when it is not used for this specific purpose. Here we explore an alternative approach: an encoder that compresses the context into continuous representations which replace token embeddings in decoder LLMs. First, we perform a systematic study of training strategies and architecture choices for the encoder. Our findings led to the design of an Adaptable text Representations Compressor, named ARC-Encoder, which outputs x times fewer continuous representations (typically x ∈ {4, 8}) than text tokens. We evaluate ARC-Encoder across a variety of LLM usage scenarios, ranging from in-context learning to context window extension, on both instruct and base decoders. Results show that ARC-Encoder achieves state-of-the-art performance on several benchmarks while improving computational efficiency at inference. Finally, we demonstrate that our models can be adapted to multiple decoders simultaneously, allowing a single encoder to generalize across different decoder LLMs. This makes ARC-Encoder a flexible and efficient solution for portable encoders that work seamlessly with multiple LLMs. We release the training code at https://github.com/kyutai-labs/ARC-Encoder ; the fine-tuning dataset and pretrained models are available at https://huggingface.co/collections/kyutai/arc-encoders-68ee18787301407d60a57047 .
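
To make the interface concrete, below is a minimal sketch of how continuous context representations can replace token embeddings in a decoder LLM. This is not the released ARC-Encoder code: the decoder name, the compress_context helper, and the mean-pooling "encoder" are placeholders used only to illustrate how roughly len(tokens)/x vectors can be concatenated with ordinary token embeddings and fed to a Hugging Face decoder via inputs_embeds.

```python
# Minimal sketch, NOT the released ARC-Encoder implementation.
# It fakes the encoder with mean pooling over token embeddings to show how
# x-times fewer continuous vectors can replace a context's token embeddings.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

decoder_name = "meta-llama/Llama-3.2-1B-Instruct"  # placeholder: any decoder-only LLM
tokenizer = AutoTokenizer.from_pretrained(decoder_name)
decoder = AutoModelForCausalLM.from_pretrained(decoder_name)
decoder.eval()


def compress_context(text: str, x: int = 8) -> torch.Tensor:
    """Stand-in for an ARC-style encoder: returns ~len(tokens) / x vectors
    in the decoder's embedding dimension (here via naive mean pooling)."""
    ids = tokenizer(text, return_tensors="pt").input_ids          # (1, T)
    emb = decoder.get_input_embeddings()(ids)                     # (1, T, d)
    T, d = emb.shape[1], emb.shape[2]
    keep = (T // x) * x                                           # drop the remainder for simplicity
    return emb[:, :keep].reshape(1, keep // x, x, d).mean(dim=2)  # (1, T // x, d)


context = "A long retrieved passage that we want to hand to the decoder in compressed form."
prompt = "\nQuestion: What is the passage about?\nAnswer:"

ctx_vecs = compress_context(context, x=8)                          # compressed context vectors
prompt_emb = decoder.get_input_embeddings()(
    tokenizer(prompt, return_tensors="pt").input_ids
)

# The compressed vectors take the place of the context's token embeddings.
inputs_embeds = torch.cat([ctx_vecs, prompt_emb], dim=1)
attention_mask = torch.ones(inputs_embeds.shape[:2], dtype=torch.long)

with torch.no_grad():
    out = decoder.generate(inputs_embeds=inputs_embeds,
                           attention_mask=attention_mask,
                           max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

In the actual system the compressed vectors would come from the trained ARC-Encoder rather than from mean pooling; the sketch only illustrates the decoder-side interface of consuming fewer continuous representations in place of the original context tokens.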