
Less is More: Recursive Reasoning with Tiny Networks

October 6, 2025
Author: Alexia Jolicoeur-Martineau
cs.AI

Abstract

The Hierarchical Reasoning Model (HRM) is a novel approach that uses two small neural networks recursing at different frequencies. This biologically inspired method beats Large Language Models (LLMs) on hard puzzle tasks such as Sudoku, Maze, and ARC-AGI, while being trained with small models (27M parameters) on small data (around 1000 examples). HRM holds great promise for solving hard problems with small networks, but it is not yet well understood and may be suboptimal. We propose the Tiny Recursive Model (TRM), a much simpler recursive reasoning approach that achieves significantly higher generalization than HRM while using a single tiny network with only 2 layers. With only 7M parameters, TRM obtains 45% test accuracy on ARC-AGI-1 and 8% on ARC-AGI-2, higher than most LLMs (e.g., Deepseek R1, o3-mini, Gemini 2.5 Pro) with less than 0.01% of the parameters.
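
The recursion described in the abstract can be illustrated with a minimal PyTorch sketch: a single tiny network repeatedly refines a latent reasoning state conditioned on the input and the current answer, then the answer is updated from that refined latent, and the cycle repeats. All module names, dimensions, and loop counts below are illustrative assumptions for exposition, not the paper's exact architecture (TRM's actual backbone is a 2-layer attention-based network).

```python
import torch
import torch.nn as nn

class TinyRecursiveModel(nn.Module):
    """Hypothetical sketch of TRM-style recursive reasoning.

    A single small network is reused at every recursion step;
    layer sizes and loop counts here are illustrative, not the
    paper's configuration.
    """

    def __init__(self, dim: int = 256):
        super().__init__()
        # One tiny network shared across all recursion steps.
        self.net = nn.Sequential(
            nn.Linear(3 * dim, dim),
            nn.GELU(),
            nn.Linear(dim, dim),
        )
        # Maps (answer, latent) to an updated answer embedding.
        self.to_answer = nn.Linear(2 * dim, dim)

    def forward(self, x, y, z, n_latent: int = 6, n_outer: int = 3):
        # Outer loop: alternate latent refinement and answer updates.
        for _ in range(n_outer):
            # Inner loop: refine the latent reasoning state z,
            # conditioned on the input x and the current answer y.
            for _ in range(n_latent):
                z = self.net(torch.cat([x, y, z], dim=-1))
            # Update the answer embedding from the refined latent.
            y = self.to_answer(torch.cat([y, z], dim=-1))
        return y, z

# Example: a batch of 4 puzzle embeddings with zero-initialized
# answer and latent states (all shapes are assumptions).
x = torch.randn(4, 256)
y = torch.zeros(4, 256)
z = torch.zeros(4, 256)
model = TinyRecursiveModel(dim=256)
y_out, z_out = model(x, y, z)
```

Because the same tiny network is applied at every step, effective depth comes from recursion rather than parameter count, which is how a 7M-parameter model can trade depth for iterations on hard puzzles.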