Mixing Mechanisms: How Language Models Retrieve Bound Entities In-Context
October 7, 2025
Authors: Yoav Gur-Arieh, Mor Geva, Atticus Geiger
cs.AI
Abstract
A key component of in-context reasoning is the ability of language models
(LMs) to bind entities for later retrieval. For example, an LM might represent
"Ann loves pie" by binding "Ann" to "pie", allowing it to later retrieve "Ann"
when asked "Who loves pie?" Prior research on short lists of bound entities
found strong evidence that LMs implement such retrieval via a positional
mechanism, where "Ann" is retrieved based on its position in context. In this
work, we find that this mechanism generalizes poorly to more complex settings;
as the number of bound entities in context increases, the positional mechanism
becomes noisy and unreliable in middle positions. To compensate for this, we
find that LMs supplement the positional mechanism with a lexical mechanism
(retrieving "Ann" using its bound counterpart "pie") and a reflexive mechanism
(retrieving "Ann" through a direct pointer). Through extensive experiments on
nine models and ten binding tasks, we uncover a consistent pattern in how LMs
mix these mechanisms to drive model behavior. We leverage these insights to
develop a causal model combining all three mechanisms that estimates next token
distributions with 95% agreement. Finally, we show that our model generalizes
to substantially longer inputs of open-ended text interleaved with entity
groups, further demonstrating the robustness of our findings in more natural
settings. Overall, our study establishes a more complete picture of how LMs
bind and retrieve entities in-context.
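
The abstract does not spell out the internals of the paper's causal model, but the three mechanisms it names can be illustrated with a small toy sketch. The code below is our own rough approximation, not the authors' implementation: it models each mechanism as a distribution over entity slots and combines them as a convex mixture. The noise model, mixing weights, and all function names are hypothetical placeholders chosen only to mirror the abstract's qualitative claims (e.g., the positional pointer blurring at middle positions).

```python
# Toy illustration of mixing positional, lexical, and reflexive retrieval.
# This is a hypothetical sketch, NOT the paper's actual causal model.
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - np.max(x))
    return e / e.sum()

def positional_mechanism(query_idx: int, n: int, noise: float = 0.5) -> np.ndarray:
    """Noisy pointer by list position: sharp near the list's edges, blurry in
    the middle (a made-up model of the abstract's 'noisy and unreliable in
    middle positions' claim)."""
    positions = np.arange(n)
    edge_dist = min(query_idx, n - 1 - query_idx)  # distance from nearest edge
    sigma = 0.1 + noise * edge_dist                # more blur toward the middle
    logits = -((positions - query_idx) ** 2) / (2 * sigma ** 2)
    return softmax(logits)

def lexical_mechanism(query_attr: str, attrs: list[str]) -> np.ndarray:
    """Retrieve the entity whose bound attribute matches the queried one
    (e.g., retrieve "Ann" via its bound counterpart "pie")."""
    logits = np.array([10.0 if a == query_attr else 0.0 for a in attrs])
    return softmax(logits)

def reflexive_mechanism(query_idx: int, n: int) -> np.ndarray:
    """Direct pointer to the entity itself: a near-one-hot distribution."""
    logits = np.zeros(n)
    logits[query_idx] = 10.0
    return softmax(logits)

def mixed_retrieval(query_idx: int, query_attr: str, attrs: list[str],
                    weights=(0.4, 0.4, 0.2)) -> np.ndarray:
    """Convex mixture of the three mechanisms (weights are hypothetical)."""
    n = len(attrs)
    p = (weights[0] * positional_mechanism(query_idx, n)
         + weights[1] * lexical_mechanism(query_attr, attrs)
         + weights[2] * reflexive_mechanism(query_idx, n))
    return p / p.sum()

# Example context: "Ann loves pie. Bob loves cake. ... Who loves pie?"
entities = ["Ann", "Bob", "Cara", "Dan", "Eve"]
attrs = ["pie", "cake", "tea", "soup", "jam"]
dist = mixed_retrieval(query_idx=0, query_attr="pie", attrs=attrs)
for e, p in zip(entities, dist):
    print(f"{e}: {p:.3f}")
```

In this toy setup the lexical and reflexive terms stay sharp regardless of list length, so they compensate exactly where the positional pointer degrades; that division of labor is the pattern the abstract describes, though the real model's mixture is estimated from interventions on nine LMs rather than fixed weights.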