Beyond Memorization: Extending Reasoning Depth with Recurrence, Memory and Test-Time Compute Scaling
August 22, 2025
Authors: Ivan Rodkin, Daniil Orel, Konstantin Smirnov, Arman Bolatov, Bilal Elbouardi, Besher Hassan, Yuri Kuratov, Aydar Bulatov, Preslav Nakov, Timothy Baldwin, Artem Shelmanov, Mikhail Burtsev
cs.AI
Abstract
Reasoning is a core capability of large language models, yet understanding how they learn and perform multi-step reasoning remains an open problem. In this study, we explore how different architectures and training methods affect models' multi-step reasoning capabilities within a cellular automata framework. By training on state sequences generated by random Boolean functions from random initial conditions, which rules out memorization, we demonstrate that most neural architectures learn to abstract the underlying rules. While models achieve high accuracy in next-state prediction, their performance declines sharply when multi-step reasoning is required. We confirm that increasing model depth plays a crucial role in sequential computation. We demonstrate that extending the effective model depth through recurrence, memory, and test-time compute scaling substantially enhances reasoning capabilities.
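The abstract does not specify the exact experimental setup (neighborhood size, lattice width, or how rules are sampled), so the following minimal Python sketch only illustrates the general recipe: sample a random Boolean local rule, unroll it from a random initial state, and derive single-step versus multi-step prediction targets. It assumes a radius-1 (3-cell) neighborhood on a periodic 1-D lattice, analogous to elementary cellular automata; all function and parameter names (`random_boolean_rule`, `generate_sequence`, `width`, `steps`) are illustrative, not taken from the paper.

```python
import numpy as np

def random_boolean_rule(rng: np.random.Generator, neighborhood: int = 3) -> np.ndarray:
    """Sample a random Boolean function over all 2**neighborhood input patterns."""
    return rng.integers(0, 2, size=2**neighborhood, dtype=np.uint8)

def step(state: np.ndarray, rule: np.ndarray) -> np.ndarray:
    """Apply the rule to every cell of a periodic 1-D lattice (radius-1 neighborhood)."""
    left, right = np.roll(state, 1), np.roll(state, -1)
    index = (left << 2) | (state << 1) | right  # encode each 3-cell window as an int in 0..7
    return rule[index]

def generate_sequence(rng: np.random.Generator, rule: np.ndarray,
                      width: int = 32, steps: int = 8) -> np.ndarray:
    """Unroll a trajectory from a random initial condition, so answers cannot be memorized."""
    states = np.empty((steps + 1, width), dtype=np.uint8)
    states[0] = rng.integers(0, 2, size=width, dtype=np.uint8)
    for t in range(steps):
        states[t + 1] = step(states[t], rule)
    return states

rng = np.random.default_rng(0)
rule = random_boolean_rule(rng)        # a fresh random rule, not a fixed named CA
trajectory = generate_sequence(rng, rule)
# Next-state task: predict trajectory[t+1] from trajectory[t].
# Multi-step task: predict trajectory[t+k] from trajectory[t] without intermediate states.
```

Under this setup, next-state prediction requires only one application of the local rule, while predicting k steps ahead requires composing the rule k times, which is the sequential computation that the abstract links to effective model depth.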