Eliminating Position Bias of Language Models: A Mechanistic Approach

Abstract

Position bias has proven to be a prevalent issue in modern language models (LMs): models prioritize content based on its position within the given context. This bias often leads to unexpected model failures and hurts performance, robustness, and reliability across various applications. Our mechanistic analysis attributes position bias to two components employed in nearly all state-of-the-art LMs: causal attention and relative positional encodings. Specifically, our analysis of retrieval-augmented question answering (QA) shows that causal attention generally causes models to favor distant content, while relative positional encodings such as RoPE favor nearby content. Further, our empirical study on object detection reveals that position bias is also present in vision-language models (VLMs). Based on these analyses, we propose to eliminate the position bias caused by different input segment orders (e.g., options in LM-as-a-judge, retrieved documents in QA) in a training-free, zero-shot manner. Our method changes causal attention to bidirectional attention between segments and uses the model's attention values, rather than the order given in the input prompt, to decide the relative order of segments, thereby enabling Position-INvariant inferencE (PINE) at the segment level. By eliminating position bias, models achieve better performance and reliability in downstream tasks where position bias is widespread, such as LM-as-a-judge and retrieval-augmented QA. Notably, PINE is especially useful when adapting LMs to evaluate reasoning pairs: it consistently provides gains of 8 to 10 percentage points in most cases, and makes Llama-3-70B-Instruct perform even better than GPT-4-0125-preview on the RewardBench reasoning subset.
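
The two ingredients named in the abstract, bidirectional attention between segments and ordering segments by attention values, can be illustrated with a small sketch. This is not the authors' implementation: the function names, the use of -1 to mark tokens outside any segment, and the "most-attended segment first" convention are all assumptions made for illustration, and the abstract does not specify how attention values are aggregated per segment.

```python
import numpy as np

def segment_bidirectional_mask(seg_ids):
    """Boolean mask where mask[i, j] is True if token i may attend to token j.

    seg_ids[i] is -1 for ordinary tokens (hypothetical convention) and
    k >= 0 for tokens belonging to input segment k.
    """
    seg = np.asarray(seg_ids)
    idx = np.arange(seg.shape[0])
    # Standard causal attention: a token sees itself and earlier tokens.
    causal = idx[:, None] >= idx[None, :]
    # Tokens in *different* segments may see each other in both directions.
    in_seg = seg >= 0
    cross = in_seg[:, None] & in_seg[None, :] & (seg[:, None] != seg[None, :])
    return causal | cross

def order_segments_by_attention(scores):
    """Rank segments by an attention score, most-attended first.

    `scores` holds one aggregated attention value per segment; how to
    aggregate (sum, mean, per layer or head) is left open here.
    """
    return [int(i) for i in np.argsort(-np.asarray(scores), kind="stable")]

# One prefix token, two 2-token segments, one suffix token.
mask = segment_bidirectional_mask([-1, 0, 0, 1, 1, -1])
print(mask.astype(int))
print(order_segments_by_attention([0.2, 0.7, 0.1]))
```

In this sketch, tokens inside the same segment stay causal, and tokens outside the segments keep ordinary causal attention, so only the cross-segment blocks of the mask change. The ranking returned by `order_segments_by_attention` no longer depends on the order in which segments appeared in the prompt, which is the property the abstract calls position-invariant inference at the segment level.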
