Geometric Factual Recall in Transformers

Abstract

How do transformer language models memorize factual associations? A common view casts internal weight matrices as associative memories over pairs of embeddings, which requires parameter counts that scale linearly with the number of facts. We develop a theoretical and empirical account of an alternative, geometric form of memorization in which learned embeddings encode relational structure directly and the MLP plays a qualitatively different role. In a controlled setting where a single-layer transformer must memorize random bijections from subjects to a shared attribute set, we prove that a logarithmic embedding dimension suffices: subject embeddings encode linear superpositions of their associated attribute vectors, and a small MLP acts as a relation-conditioned selector that extracts the relevant attribute via ReLU gating, rather than as an associative key-value mapping. We extend these results to the multi-hop setting (chains of relational queries such as "Who is the mother of the wife of x?"), providing constructions with and without chain-of-thought that exhibit a provable capacity-depth tradeoff, complemented by a matching information-theoretic lower bound. Empirically, gradient descent discovers solutions with precisely the predicted structure. Once trained, the MLP transfers zero-shot to entirely new bijections when subject embeddings are appropriately re-initialized, revealing that it has learned a generic selection mechanism rather than memorized any particular set of facts.
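
To make the selector idea concrete, the following is a minimal NumPy sketch of a relation-conditioned ReLU gate. It is a hand-built toy and not the construction proved in the paper. It assumes that each attribute is given a distinct ±1 binary code of length ceil(log2 N), that a subject embedding stores one such code per relation in its own coordinate block (a simplified stand-in for a linear superposition), and that the relation enters as a one-hot vector. All function names (`binary_codes`, `build_subject_embeddings`, `selector_mlp`, `recall`, `demo`) and the `gate` parameter are hypothetical.

```python
import numpy as np


def binary_codes(n_items, dim):
    """Return distinct +/-1 codes: row i is the binary expansion of i."""
    bits = (np.arange(n_items)[:, None] >> np.arange(dim)) & 1
    return 2.0 * bits - 1.0


def build_subject_embeddings(bijections, attr_codes):
    """For each relation r, store the code of attribute bijections[r, s].

    bijections has shape (R, N); the result has shape (N, R * dim),
    with relation r occupying columns r*dim .. (r+1)*dim - 1.
    """
    return np.concatenate([attr_codes[perm] for perm in bijections], axis=1)


def selector_mlp(subject_emb, relation_onehot, n_relations, dim, gate=2.0):
    """One hidden ReLU layer that passes block r only if relation r is queried.

    The weights do not depend on the bijections (assumed toy construction).
    """
    blocks = subject_emb.reshape(-1, n_relations, dim)
    # Bias is 0 for the queried relation and -gate for every other relation.
    bias = gate * (relation_onehot[:, :, None] - 1.0)
    pos = np.maximum(blocks + bias, 0.0)   # keeps positive entries
    neg = np.maximum(-blocks + bias, 0.0)  # keeps negative entries
    # relu(x) - relu(-x) = x on the open block; closed blocks give 0.
    return (pos - neg).sum(axis=1)


def recall(bijections, attr_codes, subjects, relations):
    """Predict the attribute index for each (subject, relation) query."""
    n_relations = bijections.shape[0]
    dim = attr_codes.shape[1]
    emb = build_subject_embeddings(bijections, attr_codes)
    onehot = np.eye(n_relations)[relations]
    selected = selector_mlp(emb[subjects], onehot, n_relations, dim)
    # Decode by the largest inner product with the attribute codes.
    return np.argmax(selected @ attr_codes.T, axis=1)


def demo(n_subjects=64, n_relations=3, seed=0):
    """Run the same selector on two independent sets of random bijections."""
    rng = np.random.default_rng(seed)
    dim = max(1, (n_subjects - 1).bit_length())  # ceil(log2 N) for N >= 2
    attr_codes = binary_codes(n_subjects, dim)
    subjects = np.repeat(np.arange(n_subjects), n_relations)
    relations = np.tile(np.arange(n_relations), n_subjects)
    accuracies = []
    for _ in range(2):
        bij = np.stack([rng.permutation(n_subjects) for _ in range(n_relations)])
        pred = recall(bij, attr_codes, subjects, relations)
        accuracies.append(float(np.mean(pred == bij[relations, subjects])))
    return accuracies


if __name__ == "__main__":
    print(demo())
```

In this toy, `selector_mlp` never sees the bijections: swapping in new random bijections only changes the subject embeddings, which loosely mirrors the zero-shot transfer described above. The per-relation block layout and the exact binary codes are simplifying assumptions chosen so that recall is exact by construction; the paper's learned embeddings need not take this form.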

