Universal Reasoning Model
December 16, 2025
Authors: Zitian Gao, Lynx Chen, Yihao Xiao, He Xing, Ran Tao, Haoming Luo, Joey Zhou, Bryan Dai
cs.AI
Abstract
Universal transformers (UTs) have been widely used for complex reasoning tasks such as ARC-AGI and Sudoku, yet the specific sources of their performance gains remain underexplored. In this work, we systematically analyze UT variants and show that improvements on ARC-AGI arise primarily from the recurrent inductive bias and strong nonlinear components of the Transformer, rather than from elaborate architectural designs. Motivated by this finding, we propose the Universal Reasoning Model (URM), which enhances the UT with short convolution and truncated backpropagation. Our approach substantially improves reasoning performance, achieving a state-of-the-art 53.8% pass@1 on ARC-AGI 1 and 16.0% pass@1 on ARC-AGI 2. Our code is available at https://github.com/zitian-gao/URM.
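The abstract names two concrete modifications to the UT: a short convolution and truncated backpropagation through the weight-shared recurrence. The following is a minimal sketch of how these two ingredients could fit together, not the authors' implementation (see the linked repository for that). It assumes a PyTorch-style setup; all names and hyperparameters (URMBlock, recur, n_steps, bptt_steps, kernel_size) are illustrative.

```python
# Illustrative sketch only: a weight-shared transformer block applied
# recurrently, with (1) a depthwise short convolution mixing nearby tokens
# before attention and (2) truncated backpropagation that detaches the
# recurrent state so gradients flow only through the last few iterations.
import torch
import torch.nn as nn


class URMBlock(nn.Module):
    """One weight-shared block, applied recurrently over n_steps."""

    def __init__(self, d_model: int = 256, n_heads: int = 4, kernel_size: int = 3):
        super().__init__()
        # Depthwise short convolution over the sequence dimension.
        self.short_conv = nn.Conv1d(
            d_model, d_model, kernel_size, padding=kernel_size // 2, groups=d_model
        )
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model)
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # (batch, seq, dim) -> (batch, dim, seq) for Conv1d, then back.
        x = x + self.short_conv(x.transpose(1, 2)).transpose(1, 2)
        h = self.norm1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]
        x = x + self.mlp(self.norm2(x))
        return x


def recur(block: URMBlock, x: torch.Tensor, n_steps: int = 8, bptt_steps: int = 2):
    """Apply the shared block n_steps times; only the final bptt_steps
    iterations contribute gradients (truncated backpropagation)."""
    for step in range(n_steps):
        if step == n_steps - bptt_steps:
            x = x.detach()  # cut the gradient path through earlier iterations
        x = block(x)
    return x


if __name__ == "__main__":
    block = URMBlock()
    tokens = torch.randn(2, 30 * 30, 256)  # e.g. a flattened 30x30 ARC grid
    out = recur(block, tokens)
    out.sum().backward()  # gradients flow only through the last bptt_steps
```

Detaching the state a few iterations before the end keeps memory and gradient-path length bounded while still training the shared weights through the final recurrence steps, which is one plausible reading of the truncated backpropagation described in the abstract.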