ChatPaper.ai

The Scaling Properties of Implicit Deductive Reasoning in Transformers

May 5, 2026
Authors: Enrico Vompa, Tanel Tammet
cs.AI

Abstract

We investigate the scaling properties of implicit deductive reasoning over Horn clauses in depth-bounded Transformers. By systematically decorrelating provability from spurious features and enforcing algorithmic alignment, we find that in sufficiently deep models with a bidirectional prefix mask, implicit reasoning approaches explicit chain-of-thought (CoT) performance across graph topologies and problem widths, though CoT remains necessary for depth extrapolation.
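The abstract contrasts a bidirectional prefix mask with ordinary causal attention. A minimal sketch of such a prefix-LM mask is shown below; the function name, shapes, and the 0/1 convention are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def prefix_lm_mask(prefix_len: int, seq_len: int) -> np.ndarray:
    """Build a prefix-LM attention mask (illustrative sketch).

    Tokens inside the prefix (e.g. the clause set and query) attend to
    each other bidirectionally; tokens after the prefix attend causally,
    i.e. to the whole prefix and to earlier suffix tokens only.
    Entry [i, j] == 1 means position i may attend to position j.
    """
    # Standard lower-triangular causal mask.
    mask = np.tril(np.ones((seq_len, seq_len), dtype=int))
    # Lift the causal restriction within the prefix block.
    mask[:prefix_len, :prefix_len] = 1
    return mask

mask = prefix_lm_mask(prefix_len=3, seq_len=5)
```

With `prefix_len=3`, position 0 can attend forward to position 2 (bidirectional within the prefix), while position 3 still cannot attend to position 4 (causal in the suffix).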