Geometric Factual Recall in Transformers
May 12, 2026
Authors: Shauli Ravfogel, Gilad Yehudai, Joan Bruna, Alberto Bietti
cs.AI
Abstract
How do transformer language models memorize factual associations? A common view casts internal weight matrices as associative memories over pairs of embeddings, requiring parameter counts that scale linearly with the number of facts. We develop a theoretical and empirical account of an alternative, geometric form of memorization in which learned embeddings encode relational structure directly, and the MLP plays a qualitatively different role. In a controlled setting where a single-layer transformer must memorize random bijections from subjects to a shared attribute set, we prove that a logarithmic embedding dimension suffices: subject embeddings encode linear superpositions of their associated attribute vectors, and a small MLP acts not as an associative key-value mapping but as a relation-conditioned selector that extracts the relevant attribute via ReLU gating. We extend these results to the multi-hop setting -- chains of relational queries such as "Who is the mother of the wife of x?" -- providing constructions with and without chain-of-thought that exhibit a provable capacity-depth tradeoff, complemented by a matching information-theoretic lower bound. Empirically, gradient descent discovers solutions with precisely the predicted structure. Once trained, the MLP transfers zero-shot to entirely new bijections when subject embeddings are appropriately re-initialized, revealing that it has learned a generic selection mechanism rather than having memorized any particular set of facts.
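To make the mechanism concrete, here is a minimal numerical sketch of the idea described above: subject embeddings are built as superpositions of near-orthogonal attribute vectors, and recall applies a relation-conditioned ReLU gate to read out the queried attribute. The relation-specific sign patterns used to bind attributes to relations, the dimensions, and the 0.5 threshold are illustrative assumptions, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)

n_subjects, n_attributes, n_relations, d = 100, 30, 3, 256  # illustrative sizes

# Shared attribute codebook: random, near-orthogonal unit vectors.
A = rng.standard_normal((n_attributes, d))
A /= np.linalg.norm(A, axis=1, keepdims=True)

# Relation-specific random sign patterns used to bind an attribute to a
# relation before superposing (a simplification for illustration only).
T = rng.choice([-1.0, 1.0], size=(n_relations, d))

# Ground-truth facts: for each relation, a random subject -> attribute map.
facts = rng.integers(0, n_attributes, size=(n_relations, n_subjects))

# Geometric memorization: a subject's embedding is the superposition of its
# relation-bound attribute vectors -- no weight matrix stores the facts.
S = np.stack([
    sum(T[r] * A[facts[r, x]] for r in range(n_relations))
    for x in range(n_subjects)
])

def recall(x, r, threshold=0.5):
    """Relation-conditioned selection: unbind with the relation's sign
    pattern, score against the shared codebook, and gate with a ReLU."""
    scores = A @ (T[r] * S[x])                 # ~1 for the stored attribute, ~0 elsewhere
    gated = np.maximum(scores - threshold, 0)  # ReLU gate suppresses cross-talk
    return int(gated.argmax())

correct = sum(
    recall(x, r) == facts[r, x]
    for r in range(n_relations) for x in range(n_subjects)
)
print(f"recall accuracy: {correct / (n_relations * n_subjects):.3f}")
```

At these sizes the printed accuracy is essentially 1: a single d-dimensional vector stores several facts at once, and the decoder that retrieves them is generic, touching the fact table only through the subject embedding itself. This is only a toy analogue of the paper's trained MLP selector, but it illustrates why the relevant quantity is the geometry of the embeddings rather than a per-fact slot in the weights.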