

Unlocking Out-of-Distribution Generalization in Transformers via Recursive Latent Space Reasoning

October 15, 2025
Authors: Awni Altabaa, Siyu Chen, John Lafferty, Zhuoran Yang
cs.AI

Abstract

Systematic, compositional generalization beyond the training distribution remains a core challenge in machine learning -- and a critical bottleneck for the emergent reasoning abilities of modern language models. This work investigates out-of-distribution (OOD) generalization in Transformer networks, using a GSM8K-style task of modular arithmetic on computational graphs as a testbed. We introduce and explore a set of four architectural mechanisms aimed at enhancing OOD generalization: (i) input-adaptive recurrence; (ii) algorithmic supervision; (iii) anchored latent representations via a discrete bottleneck; and (iv) an explicit error-correction mechanism. Collectively, these mechanisms yield an architectural approach for native and scalable latent space reasoning in Transformer networks with robust algorithmic generalization capabilities. We complement these empirical results with a detailed mechanistic interpretability analysis that reveals how these mechanisms give rise to robust OOD generalization abilities.
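
The first three mechanisms can be pictured concretely as a Transformer block applied recurrently, with latent states snapped to a fixed codebook at every step. Below is a minimal PyTorch sketch of that idea, not the paper's implementation: the module and hyperparameter names (RecursiveLatentReasoner, codebook_size, max_steps, the 0.5 halting threshold) are illustrative assumptions.

```python
# Minimal sketch (assumptions, not the paper's code): a recurrent Transformer
# block with a discrete bottleneck that anchors latent states to a codebook,
# plus a learned halting score for input-adaptive recurrence.
import torch
import torch.nn as nn

class RecursiveLatentReasoner(nn.Module):
    def __init__(self, d_model=256, n_heads=4, codebook_size=512, max_steps=16):
        super().__init__()
        self.block = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        # Anchored latent representations: a fixed set of discrete codes.
        self.codebook = nn.Embedding(codebook_size, d_model)
        # Input-adaptive recurrence: a per-step halting score.
        self.halt = nn.Linear(d_model, 1)
        self.max_steps = max_steps

    def quantize(self, h):
        # Snap each latent vector to its nearest codebook entry; the
        # straight-through trick lets gradients flow to pre-quantized states.
        d = (h.unsqueeze(-2) - self.codebook.weight).pow(2).sum(-1)  # (B, T, K)
        codes = d.argmin(dim=-1)                                     # (B, T)
        q = self.codebook(codes)
        return h + (q - h).detach(), codes

    def forward(self, h):
        all_codes = []
        for _ in range(self.max_steps):
            h = self.block(h)
            h, codes = self.quantize(h)   # discrete bottleneck
            all_codes.append(codes)       # candidate targets for supervision
            # Stop early when the mean halting score crosses a threshold
            # (the 0.5 cutoff is an arbitrary placeholder).
            if torch.sigmoid(self.halt(h)).mean() > 0.5:
                break
        return h, all_codes

# Example: two sequences of eight latent tokens.
# h_out, step_codes = RecursiveLatentReasoner()(torch.randn(2, 8, 256))
```

The remaining two mechanisms would sit on top of this loop: the per-step codes returned above are natural targets for an algorithmic-supervision loss on intermediate values of the computational-graph algorithm, and an error-correction head could learn to detect and overwrite incorrect codes. Both are omitted from the sketch.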