Solve the Loop: Attractor Models for Language and Reasoning

Abstract

Looped Transformers offer a promising alternative to purely feed-forward computation: they iteratively refine latent representations, which improves language modeling and reasoning. Yet recurrent architectures remain unstable to train, costly to optimize and deploy, and constrained to small, fixed recurrence depths. We introduce Attractor Models, in which a backbone module first proposes output embeddings and an attractor module then refines them by solving for a fixed point, with gradients obtained through implicit differentiation. As a result, training memory stays constant with respect to effective depth, and the number of iterations is chosen adaptively by convergence. Empirically, Attractor Models outperform existing models in two regimes: large-scale language-model pretraining and reasoning with tiny models. In language modeling, Attractor Models deliver a Pareto improvement over standard Transformers and stable looped models across sizes, improving perplexity by up to 46.6% and downstream accuracy by up to 19.7% while reducing training cost. Notably, a 770M-parameter Attractor Model outperforms a 1.3B-parameter Transformer trained on twice as many tokens. On challenging reasoning tasks, a model with only 27M parameters, trained on approximately 1000 examples, achieves 91.4% accuracy on Sudoku-Extreme and 93.1% on Maze-Hard. It scales favorably where frontier models such as Claude and GPT o3 fail completely and where specialized recursive reasoners collapse at larger sizes. Lastly, we show that Attractor Models exhibit a novel phenomenon, which we call equilibrium internalization: fixed-point training places the model's initial output embedding near equilibrium, so the solver can be removed at inference time with little degradation. Together, these results suggest that Attractor Models make iterative refinement scalable by turning recurrence into a computation the model can learn to internalize.
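
The fixed-point solve and implicit differentiation described in the abstract can be illustrated with a minimal scalar sketch. This is not the paper's architecture: the map f(z; x, w) = tanh(w*z + x), the function names, and the parameter values are all assumptions chosen for illustration. The sketch shows three things under those assumptions: the iteration count is set by convergence rather than fixed in advance, the gradient at the fixed point comes from the implicit function theorem without storing intermediate iterates, and a proposal that already lies near equilibrium needs fewer solver steps.

```python
import math

def attractor_step(z, x, w):
    """One application of a toy attractor map f(z; x, w) = tanh(w*z + x).
    Hypothetical stand-in for an attractor module; contractive when |w| < 1."""
    return math.tanh(w * z + x)

def solve_fixed_point(z0, x, w, tol=1e-12, max_iter=500):
    """Iterate z <- f(z) from the proposal z0 until the update falls below tol.
    Returns (z_star, iterations_used); only the current iterate is kept."""
    z = z0
    for k in range(1, max_iter + 1):
        z_next = attractor_step(z, x, w)
        if abs(z_next - z) < tol:
            return z_next, k
        z = z_next
    return z, max_iter

def implicit_grad_w(z_star, x, w):
    """dz*/dw via the implicit function theorem applied to z* = f(z*; x, w):
    dz*/dw = (df/dw) / (1 - df/dz), both evaluated at the fixed point."""
    s = 1.0 - z_star ** 2      # tanh'(u) at the fixed point, since tanh(u) = z*
    df_dz = w * s
    df_dw = z_star * s
    return df_dw / (1.0 - df_dz)

# Illustrative values (assumptions, not taken from the paper).
x, w = 0.3, 0.5
z_far, iters_far = solve_fixed_point(0.0, x, w)             # proposal far from equilibrium
z_near, iters_near = solve_fixed_point(z_far + 1e-6, x, w)  # proposal near equilibrium
grad = implicit_grad_w(z_far, x, w)
print("fixed point:", z_far)
print("iterations (far start, near start):", iters_far, iters_near)
print("implicit dz*/dw:", grad)
```

Because the gradient depends only on the converged value, its cost in memory does not grow with the number of solver steps. In a real model, the scalar division would become a linear solve against the Jacobian of the attractor map.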
