Less is More: Recursive Reasoning with Tiny Networks
October 6, 2025
Author: Alexia Jolicoeur-Martineau
cs.AI
Abstract
Hierarchical Reasoning Model (HRM) is a novel approach using two small neural
networks recursing at different frequencies. This biologically inspired method
beats Large Language Models (LLMs) on hard puzzle tasks such as Sudoku, Maze,
and ARC-AGI while trained with small models (27M parameters) on small data
(around 1000 examples). HRM holds great promise for solving hard problems with
small networks, but it is not yet well understood and may be suboptimal. We
propose Tiny Recursive Model (TRM), a much simpler recursive reasoning approach
that achieves significantly higher generalization than HRM, while using a
single tiny network with only 2 layers. With only 7M parameters, TRM obtains
45% test accuracy on ARC-AGI-1 and 8% on ARC-AGI-2, higher than most LLMs
(e.g., Deepseek R1, o3-mini, Gemini 2.5 Pro) with less than 0.01% of the
parameters.
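The core idea in the abstract can be sketched as a single small network applied recursively: many cheap refinements of a latent reasoning state, punctuated by updates to the current answer. The sketch below is a loose illustration under that reading; all shapes, names, and update rules are assumptions for illustration, not the paper's exact method.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16  # hidden width (arbitrary for this sketch)

# One small two-layer network shared by every recursion step
# (weights are random here; in TRM they would be trained).
W1 = rng.normal(0.0, 0.1, size=(3 * D, D))
W2 = rng.normal(0.0, 0.1, size=(D, D))

def tiny_net(x, y, z):
    """Two-layer MLP over the concatenated question x, answer y, latent z."""
    h = np.tanh(np.concatenate([x, y, z]) @ W1)
    return np.tanh(h @ W2)

def recursive_reason(x, y, z, n_cycles=3, n_latent=6):
    """Alternate several latent refinements with an answer update each cycle."""
    for _ in range(n_cycles):
        for _ in range(n_latent):
            z = tiny_net(x, y, z)   # refine the latent reasoning state
        y = tiny_net(x, y, z)       # update the current answer estimate
    return y, z

x = rng.normal(size=D)              # embedded question (placeholder)
y = np.zeros(D)                     # initial answer embedding
z = np.zeros(D)                     # initial latent state
y, z = recursive_reason(x, y, z)
```

The point of the sketch is the depth-for-parameters trade-off: the same 2-layer network is reused across every step, so effective depth grows with the number of recursions while the parameter count stays fixed.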