Attention Basin: Why Contextual Position Matters in Large Language Models

August 7, 2025
作者: Zihao Yi, Delong Zeng, Zhenqing Ling, Haohao Luo, Zhe Xu, Wei Liu, Jian Luan, Wanxia Cao, Ying Shen
cs.AI

Abstract

The performance of Large Language Models (LLMs) is significantly sensitive to the contextual position of information in the input. To investigate the mechanism behind this positional bias, our extensive experiments reveal a consistent phenomenon we term the attention basin: when presented with a sequence of structured items (e.g., retrieved documents or few-shot examples), models systematically assign higher attention to the items at the beginning and end of the sequence, while neglecting those in the middle. Crucially, our analysis further reveals that allocating higher attention to critical information is key to enhancing model performance. Based on these insights, we introduce Attention-Driven Reranking (AttnRank), a two-stage framework that (i) estimates a model's intrinsic positional attention preferences using a small calibration set, and (ii) reorders retrieved documents or few-shot examples to align the most salient content with these high-attention positions. AttnRank is a model-agnostic, training-free, and plug-and-play method with minimal computational overhead. Experiments on multi-hop QA and few-shot in-context learning tasks demonstrate that AttnRank achieves substantial improvements across 10 large language models of varying architectures and scales, without modifying model parameters or training procedures.
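The two-stage procedure described in the abstract can be illustrated with a small sketch. The code below is an illustrative approximation, not the paper's released implementation: the function names (`estimate_position_preference`, `attnrank_reorder`), the averaging of attention mass over the calibration set, and the example numbers are all assumptions; the abstract only states that positional attention preferences are estimated on a small calibration set and that the most salient items are then placed at the most-attended positions.

```python
import numpy as np

def estimate_position_preference(attention_per_example):
    """Stage 1 (sketch, assumed aggregation): average the attention mass
    received by each item slot across a small calibration set, giving a
    positional preference profile for the model.

    attention_per_example: list of 1-D arrays, each of length n_slots,
    holding the attention mass assigned to every item in one calibration prompt.
    """
    return np.mean(np.stack(attention_per_example), axis=0)

def attnrank_reorder(items, salience, position_preference):
    """Stage 2 (sketch): map the most salient items (e.g., documents ranked
    by a retriever) onto the slots the model attends to most strongly.

    items: list of retrieved documents or few-shot examples
    salience: per-item relevance scores (higher = more important)
    position_preference: per-slot attention profile from stage 1
    """
    # Slots ordered from most- to least-attended (typically the two ends first).
    slot_order = np.argsort(position_preference)[::-1]
    # Items ordered from most- to least-salient.
    item_order = np.argsort(salience)[::-1]

    reordered = [None] * len(items)
    for slot, item_idx in zip(slot_order, item_order):
        reordered[slot] = items[item_idx]
    return reordered

# Hypothetical usage: the most relevant documents end up at the head and tail
# of the context, matching the "attention basin" profile in which middle
# positions receive the least attention.
docs = ["doc_A", "doc_B", "doc_C", "doc_D", "doc_E"]
relevance = [0.2, 0.9, 0.4, 0.7, 0.1]                # e.g., retriever scores
profile = np.array([0.30, 0.15, 0.08, 0.17, 0.30])   # calibration-set estimate
print(attnrank_reorder(docs, relevance, profile))
```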