Solve the Loop: Attractor Models for Language and Reasoning
May 12, 2026
Authors: Jacob Fein-Ashley, Paria Rashidinejad
cs.AI
Abstract
Looped Transformers offer a promising alternative to purely feed-forward computation by iteratively refining latent representations, improving language modeling and reasoning. Yet recurrent architectures remain unstable to train, costly to optimize and deploy, and constrained to small, fixed recurrence depths. We introduce Attractor Models, in which a backbone module first proposes output embeddings, then an attractor module refines them by solving for a fixed point, with gradients obtained through implicit differentiation. As a result, training memory remains constant in effective depth, and the number of iterations is chosen adaptively by convergence. Empirically, Attractor Models outperform existing models across two regimes: large-scale language-model pretraining and reasoning with tiny models. In language modeling, Attractor Models deliver a Pareto improvement over standard Transformers and stable looped models across sizes, improving perplexity by up to 46.6% and downstream accuracy by up to 19.7% while reducing training cost. Notably, a 770M Attractor Model outperforms a 1.3B Transformer trained on twice as many tokens. On challenging reasoning tasks, we show that our model with only 27M parameters and approximately 1000 examples achieves 91.4% accuracy on Sudoku-Extreme and 93.1% on Maze-Hard, scaling favorably where frontier models such as Claude and GPT o3 fail completely and specialized recursive reasoners collapse at larger sizes. Lastly, we show that Attractor Models exhibit a novel phenomenon, which we call equilibrium internalization: fixed-point training places the model's initial output embedding near equilibrium, allowing the solver to be removed at inference time with little degradation. Together, these results suggest that Attractor Models make iterative refinement scalable by turning recurrence into a computation the model can learn to internalize.
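To make the mechanism concrete, the sketch below illustrates the fixed-point-plus-implicit-differentiation idea on a toy contraction map. This is not the paper's architecture: the update `f(z, x) = tanh(W z + x)` stands in for the attractor module, `x` stands in for the backbone's proposed embedding, and the scaled weight matrix `W` is a hypothetical choice that guarantees convergence. The forward pass iterates to the equilibrium `z* = f(z*, x)`, and the backward pass recovers `dL/dx` from the implicit function theorem with one linear solve, so no intermediate iterates need to be stored (constant memory in effective depth); the result is checked against finite differences through the solver.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
W = rng.normal(size=(d, d))
W *= 0.5 / np.linalg.norm(W, 2)   # spectral norm 0.5 < 1, so f is a contraction
x = rng.normal(size=d)            # stand-in for the backbone's proposed embedding

def f(z, x):
    # Toy attractor update; any contraction in z works for this illustration.
    return np.tanh(W @ z + x)

def solve_fixed_point(x, tol=1e-12, max_iter=500):
    # Forward pass: iterate until convergence (iteration count is adaptive).
    z = np.zeros(d)
    for _ in range(max_iter):
        z_new = f(z, x)
        if np.linalg.norm(z_new - z) < tol:
            return z_new
        z = z_new
    return z

z_star = solve_fixed_point(x)

# Backward pass via implicit differentiation. For a loss L = 0.5 * ||z*||^2,
# differentiating z* = f(z*, x) gives
#   dL/dx = (df/dx)^T (I - df/dz)^{-T} (dL/dz*),
# evaluated at the fixed point. Only z_star is needed, not the iterates.
D = 1.0 - np.tanh(W @ z_star + x) ** 2     # tanh' at the fixed point
Jz = D[:, None] * W                        # df/dz = diag(D) @ W
g = z_star                                 # dL/dz* for this toy loss
v = np.linalg.solve(np.eye(d) - Jz.T, g)   # one linear solve
grad_implicit = D * v                      # (df/dx)^T v, since df/dx = diag(D)

# Sanity check: finite differences through the solver itself.
eps = 1e-6
grad_fd = np.empty(d)
for i in range(d):
    xp = x.copy(); xp[i] += eps
    xm = x.copy(); xm[i] -= eps
    grad_fd[i] = (0.5 * np.sum(solve_fixed_point(xp) ** 2)
                  - 0.5 * np.sum(solve_fixed_point(xm) ** 2)) / (2 * eps)

print(np.max(np.abs(grad_implicit - grad_fd)))
```

The key design point this toy mirrors is that the backward cost is one linear solve at the equilibrium rather than backpropagation through every iteration, which is why memory stays constant no matter how many refinement steps the forward solve takes.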