Lost in the Prompt Order: Revealing the Limitations of Causal Attention in Language Models
January 20, 2026
Authors: Hyunjong Ok, Jaeho Lee
cs.AI
Abstract
Large language models exhibit surprising sensitivity to the structure of the prompt, but the mechanisms underlying this sensitivity remain poorly understood. In this work, we conduct an in-depth investigation of a striking case: in multiple-choice question answering, placing the context before the question and options (CQO) consistently outperforms the reverse order (QOC) by over 14 percentage points, across a wide range of models and datasets. Through systematic architectural analysis, we identify causal attention as the core mechanism: in QOC prompts, the causal mask prevents option tokens from attending to the context, creating an information bottleneck in which the context is invisible to the options.
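
A minimal sketch of the visibility argument, not the authors' code: under a causal mask, a token can only attend to earlier positions, so option tokens see the context only if the context precedes them in the prompt. The function names and segment lengths below are hypothetical placeholders.

```python
import torch

def causal_mask(seq_len: int) -> torch.Tensor:
    # True where attention is allowed: token i may attend to token j <= i.
    return torch.tril(torch.ones(seq_len, seq_len)).bool()

def context_visible_to_options(order: list[str], lengths: dict[str, int]) -> bool:
    """Return True if any option token can attend to any context token."""
    # Lay the prompt segments out in the given order, one entry per token.
    segments = [name for name in order for _ in range(lengths[name])]
    mask = causal_mask(len(segments))
    option_rows = [i for i, s in enumerate(segments) if s == "options"]
    context_cols = [j for j, s in enumerate(segments) if s == "context"]
    return bool(mask[option_rows][:, context_cols].any())

lengths = {"context": 5, "question": 3, "options": 4}  # hypothetical segment sizes
print(context_visible_to_options(["context", "question", "options"], lengths))  # CQO -> True
print(context_visible_to_options(["question", "options", "context"], lengths))  # QOC -> False
```

In the QOC ordering the context occupies positions after the options, so every (option row, context column) entry of the causal mask is False, which is the information bottleneck described above.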