Attention Basin: Why Contextual Position Matters in Large Language Models

August 7, 2025
作者: Zihao Yi, Delong Zeng, Zhenqing Ling, Haohao Luo, Zhe Xu, Wei Liu, Jian Luan, Wanxia Cao, Ying Shen
cs.AI

Abstract

The performance of Large Language Models (LLMs) is significantly sensitive to the contextual position of information in the input. To investigate the mechanism behind this positional bias, our extensive experiments reveal a consistent phenomenon we term the attention basin: when presented with a sequence of structured items (e.g., retrieved documents or few-shot examples), models systematically assign higher attention to the items at the beginning and end of the sequence, while neglecting those in the middle. Crucially, our analysis further reveals that allocating higher attention to critical information is key to enhancing model performance. Based on these insights, we introduce Attention-Driven Reranking (AttnRank), a two-stage framework that (i) estimates a model's intrinsic positional attention preferences using a small calibration set, and (ii) reorders retrieved documents or few-shot examples to align the most salient content with these high-attention positions. AttnRank is a model-agnostic, training-free, and plug-and-play method with minimal computational overhead. Experiments on multi-hop QA and few-shot in-context learning tasks demonstrate that AttnRank achieves substantial improvements across 10 large language models of varying architectures and scales, without modifying model parameters or training procedures.
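The reordering stage described above admits a simple sketch: given per-position attention scores estimated from a calibration set and per-item relevance scores, greedily place the most salient items into the highest-attention slots. The function name, inputs, and greedy assignment below are illustrative assumptions, not the paper's exact procedure:

```python
def attnrank_reorder(items, relevance, position_attention):
    """Place the most relevant items into the highest-attention positions.

    items              -- list of documents / few-shot examples
    relevance          -- relevance score per item (e.g. from a retriever)
    position_attention -- calibrated attention score per slot (the
                          "attention basin" profile: high at the ends,
                          low in the middle)
    """
    n = len(items)
    # slots ranked by calibrated attention, highest first
    slots_by_attention = sorted(range(n),
                                key=lambda p: position_attention[p],
                                reverse=True)
    # items ranked by relevance, highest first
    items_by_relevance = sorted(range(n),
                                key=lambda i: relevance[i],
                                reverse=True)
    # match the k-th most relevant item to the k-th most attended slot
    reordered = [None] * n
    for rank, slot in enumerate(slots_by_attention):
        reordered[slot] = items[items_by_relevance[rank]]
    return reordered


# Basin-shaped attention profile: ends high, middle low.
docs = ["a", "b", "c", "d", "e"]
order = attnrank_reorder(docs,
                         relevance=[0.1, 0.9, 0.5, 0.3, 0.7],
                         position_attention=[0.9, 0.3, 0.2, 0.4, 0.8])
# The two most relevant docs ("b", "e") land at the ends.
```

Because the attention profile is estimated once per model on a small calibration set, the per-query cost reduces to two sorts, consistent with the paper's claim of minimal computational overhead.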