Solve the Loop: Attractor Models for Language and Reasoning

Abstract

Looped Transformers offer a promising alternative to purely feed-forward computation: they iteratively refine latent representations, which improves language modeling and reasoning. Yet recurrent architectures remain unstable to train, costly to optimize and deploy, and constrained to small, fixed recurrence depths. We introduce Attractor Models, in which a backbone module first proposes output embeddings and an attractor module then refines them by solving for a fixed point, with gradients obtained through implicit differentiation. Training memory therefore stays constant regardless of effective depth, and the number of iterations is chosen adaptively by convergence. Empirically, Attractor Models outperform existing models in two regimes: large-scale language-model pretraining and reasoning with tiny models. In language modeling, Attractor Models deliver a Pareto improvement over standard Transformers and stable looped models across sizes, improving perplexity by up to 46.6% and downstream accuracy by up to 19.7% while reducing training cost. Notably, a 770M Attractor Model outperforms a 1.3B Transformer trained on twice as many tokens. On challenging reasoning tasks, our model achieves 91.4% accuracy on Sudoku-Extreme and 93.1% on Maze-Hard with only 27M parameters and approximately 1000 examples. It scales favorably where frontier models such as Claude and GPT o3 fail completely and where specialized recursive reasoners collapse at larger sizes. Lastly, we show that Attractor Models exhibit a novel phenomenon, which we call equilibrium internalization: fixed-point training places the model's initial output embedding near equilibrium, so the solver can be removed at inference time with little degradation. Together, these results suggest that Attractor Models make iterative refinement scalable by turning recurrence into a computation the model can learn to internalize.
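The mechanism described in the abstract (propose an embedding, refine it to a fixed point, differentiate implicitly) can be illustrated with a scalar toy. The sketch below is not the paper's architecture. The map `attractor_step`, the plain forward-iteration solver, the tolerance, and all numeric values are assumptions chosen only so the example is easy to check. It shows three things: the iteration count is set by convergence rather than fixed in advance, the gradient of the fixed point z* = f(z*, w) follows from the implicit function theorem as dz*/dw = (∂f/∂w) / (1 − ∂f/∂z) using only the converged value, and a proposal that already lies near equilibrium needs fewer solver steps.

```python
import math

def attractor_step(z, x, w):
    # Toy stand-in for an attractor module (an assumption, not the paper's model):
    # f(z) = tanh(w*z + x). For |w| < 1 the map is a contraction, so a unique
    # fixed point exists and plain iteration converges to it.
    return math.tanh(w * z + x)

def solve_fixed_point(x, w, z0, tol=1e-12, max_iter=500):
    # Forward iteration starting from the proposal z0.
    # Stops adaptively as soon as the update is smaller than tol.
    z = z0
    for n_iter in range(1, max_iter + 1):
        z_next = attractor_step(z, x, w)
        if abs(z_next - z) < tol:
            return z_next, n_iter
        z = z_next
    return z, max_iter

def implicit_grad_w(z_star, x, w):
    # Implicit differentiation at the fixed point: no solver iterates are
    # stored, so memory does not grow with the number of iterations.
    sech2 = 1.0 - math.tanh(w * z_star + x) ** 2
    df_dz = w * sech2        # partial derivative of f with respect to z
    df_dw = z_star * sech2   # partial derivative of f with respect to w
    return df_dw / (1.0 - df_dz)

x, w = 0.5, 0.8  # hypothetical input and weight

# Proposal far from equilibrium versus one already close to it.
z_star, n_far = solve_fixed_point(x, w, z0=0.0)
_, n_near = solve_fixed_point(x, w, z0=z_star + 1e-3)

# Finite-difference reference for the implicit gradient.
h = 1e-5
z_plus, _ = solve_fixed_point(x, w + h, z0=0.0)
z_minus, _ = solve_fixed_point(x, w - h, z0=0.0)
fd_grad = (z_plus - z_minus) / (2 * h)

print("fixed point:", z_star)
print("iterations (far, near):", n_far, n_near)
print("implicit grad:", implicit_grad_w(z_star, x, w), "finite diff:", fd_grad)
```

In this toy, the drop in iteration count for the nearby proposal loosely mirrors what the abstract calls equilibrium internalization: if the first proposal is already close to the fixed point, the solver has little left to do.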
