Eliminating Position Bias of Language Models: A Mechanistic Approach
July 1, 2024
Authors: Ziqi Wang, Hanlin Zhang, Xiner Li, Kuan-Hao Huang, Chi Han, Shuiwang Ji, Sham M. Kakade, Hao Peng, Heng Ji
cs.AI
Abstract
Position bias has proven to be a prevalent issue of modern language models
(LMs), where the models prioritize content based on its position within the
given context. This bias often leads to unexpected model failures and hurts
performance, robustness, and reliability across various applications. Our
mechanistic analysis attributes the position bias to two components employed in
nearly all state-of-the-art LMs: causal attention and relative positional
encodings. Specifically, we find that causal attention generally causes models
to favor distant content, while relative positional encodings like RoPE prefer
nearby ones based on the analysis of retrieval-augmented question answering
(QA). Further, our empirical study on object detection reveals that position
bias is also present in vision-language models (VLMs).
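Both components are easy to see in isolation. Below is a minimal PyTorch sketch (not the paper's analysis code) of a single attention head combining a causal mask with rotary position embeddings; the rope_rotate helper is an illustrative assumption. These are the two mechanisms the analysis above attributes position bias to: the causal mask blocks attention to later positions outright, while RoPE makes attention logits depend on relative distance.

```python
import torch

def rope_rotate(x: torch.Tensor, pos: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Apply a rotary position embedding (RoPE) to x of shape (seq, dim).
    After rotation, a query-key dot product depends only on their relative distance."""
    seq, dim = x.shape
    half = dim // 2
    freqs = base ** (-torch.arange(half, dtype=torch.float32) / half)
    angles = pos[:, None] * freqs[None, :]            # (seq, half)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[:, :half], x[:, half:]
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)

seq_len, dim = 8, 16
q, k = torch.randn(seq_len, dim), torch.randn(seq_len, dim)
pos = torch.arange(seq_len, dtype=torch.float32)

# RoPE makes the attention logits distance-dependent ...
logits = rope_rotate(q, pos) @ rope_rotate(k, pos).T / dim ** 0.5
# ... and the causal mask removes attention to later positions entirely.
causal = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
attn = logits.masked_fill(causal, float("-inf")).softmax(dim=-1)
print(attn[-1])  # the last token's attention over every earlier position
```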
Based on the above analyses, we propose to ELIMINATE position bias caused by
different input segment orders (e.g., options in LM-as-a-judge, retrieved
documents in QA) in a TRAINING-FREE ZERO-SHOT manner. Our method changes the
causal attention to bidirectional attention between segments and utilizes model
attention values to decide the relative orders of segments instead of using the
order provided in input prompts, therefore enabling Position-INvariant
inferencE (PINE) at the segment level. By eliminating position bias, models
achieve better performance and reliability in downstream tasks where position
bias widely exists, such as LM-as-a-judge and retrieval-augmented QA.
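To make the PINE description concrete, here is a hypothetical PyTorch sketch of its two ingredients under simplifying assumptions: an attention mask that stays causal within each segment but is bidirectional between segments, and a heuristic that ranks segments by query-to-segment attention scores rather than by prompt order. The function names and the mean-dot-product importance score are illustrative, not the authors' released implementation.

```python
import torch

def inter_segment_bidirectional_mask(seg_bounds: list[tuple[int, int]],
                                     seq_len: int) -> torch.Tensor:
    """Boolean attention mask (True = blocked): causal within each segment,
    bidirectional between different segments."""
    mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
    for i, (s1, e1) in enumerate(seg_bounds):
        for j, (s2, e2) in enumerate(seg_bounds):
            if i != j:
                # Tokens in segment i may attend to all tokens in segment j,
                # regardless of which segment appears first in the prompt.
                mask[s1:e1, s2:e2] = False
    return mask

def rank_segments_by_attention(q: torch.Tensor,
                               seg_keys: list[torch.Tensor]) -> list[int]:
    """Order segments by mean query-to-segment attention logit (descending),
    so relative positions reflect content rather than prompt order."""
    d = q.shape[0]
    scores = [float((k @ q).mean()) / d ** 0.5 for k in seg_keys]
    return sorted(range(len(seg_keys)), key=lambda i: scores[i], reverse=True)

# Toy usage: three retrieved documents at positions 0-3, 4-8, and 9-12.
seq_len, dim = 13, 16
bounds = [(0, 4), (4, 9), (9, 13)]
print(inter_segment_bidirectional_mask(bounds, seq_len))
q = torch.randn(dim)
docs = [torch.randn(e - s, dim) for s, e in bounds]
print(rank_segments_by_attention(q, docs))
```

Per the abstract, the full method applies this attention-based reordering during inference itself, which is what makes the output invariant to the segment order given in the prompt; the sketch above only isolates the ordering signal.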
Notably, PINE is especially useful when adapting LMs for evaluating reasoning
pairs: it consistently provides performance gains of 8 to 10 percentage points
in most cases, and makes Llama-3-70B-Instruct perform even better than
GPT-4-0125-preview on the RewardBench reasoning subset.