Beyond Memorization: Extending Reasoning Depth with Recurrence, Memory and Test-Time Compute Scaling

August 22, 2025
Authors: Ivan Rodkin, Daniil Orel, Konstantin Smirnov, Arman Bolatov, Bilal Elbouardi, Besher Hassan, Yuri Kuratov, Aydar Bulatov, Preslav Nakov, Timothy Baldwin, Artem Shelmanov, Mikhail Burtsev
cs.AI

Abstract

Reasoning is a core capability of large language models, yet understanding how they learn and perform multi-step reasoning remains an open problem. In this study, we explore how different architectures and training methods affect models' multi-step reasoning capabilities within a cellular-automata framework. By training on state sequences generated with random Boolean functions for random initial conditions, thereby excluding memorization, we demonstrate that most neural architectures learn to abstract the underlying rules. While models achieve high accuracy in next-state prediction, their performance declines sharply when multi-step reasoning is required. We confirm that increasing model depth plays a crucial role in sequential computation. Finally, we demonstrate that extending the effective model depth through recurrence, memory, and test-time compute scaling substantially enhances reasoning capabilities.
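
For intuition, here is a minimal sketch of how such memorization-free training data could be generated: a one-dimensional binary cellular automaton whose update rule is a random Boolean function of each cell's neighborhood, applied to a random initial state. The neighborhood size, grid width, and trajectory length below are illustrative assumptions, not the paper's exact configuration, which the abstract does not specify.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_boolean_rule(k: int = 3) -> np.ndarray:
    # A random Boolean function, stored as a lookup table over all
    # 2**k possible neighborhood configurations.
    return rng.integers(0, 2, size=2 ** k, dtype=np.uint8)

def step(state: np.ndarray, rule: np.ndarray, k: int = 3) -> np.ndarray:
    # Encode each cell's k-cell neighborhood as a binary index
    # (periodic border), then look up the next cell value in the table.
    idx = np.zeros(state.size, dtype=np.int64)
    for offset in range(k):
        idx = (idx << 1) | np.roll(state, k // 2 - offset)
    return rule[idx]

def generate_sequence(width: int = 32, n_steps: int = 10, k: int = 3):
    # One training example: a fresh random rule, a random initial
    # state, and the resulting trajectory of states. Sampling both the
    # rule and the initial condition is what rules out memorization.
    rule = random_boolean_rule(k)
    state = rng.integers(0, 2, size=width, dtype=np.uint8)
    states = [state]
    for _ in range(n_steps):
        states.append(step(states[-1], rule, k))
    return rule, np.stack(states)

rule, trajectory = generate_sequence()
print(trajectory.shape)  # (11, 32): initial state plus 10 successors
```

Under this setup, next-state prediction corresponds to learning `step` from (state, next-state) pairs, while multi-step reasoning corresponds to predicting a state several applications of `step` ahead, which is where the abstract reports the sharp performance drop.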