
Schoenfeld's Anatomy of Mathematical Reasoning by Language Models

December 23, 2025
Authors: Ming Li, Chenrui Fan, Yize Cheng, Soheil Feizi, Tianyi Zhou
cs.AI

Abstract

Large language models increasingly expose reasoning traces, yet their underlying cognitive structure and steps remain difficult to identify and analyze beyond surface-level statistics. We adopt Schoenfeld's Episode Theory as an inductive, intermediate-scale lens and introduce ThinkARM (Anatomy of Reasoning in Models), a scalable framework that explicitly abstracts reasoning traces into functional reasoning steps such as Analysis, Explore, Implement, and Verify. When applied to mathematical problem solving by diverse models, this abstraction reveals reproducible thinking dynamics and structural differences between reasoning and non-reasoning models that are not apparent from token-level views. We further present two diagnostic case studies showing that exploration functions as a critical branching step associated with correctness, and that efficiency-oriented methods selectively suppress evaluative feedback steps rather than uniformly shortening responses. Together, our results demonstrate that episode-level representations make reasoning steps explicit, enabling systematic analysis of how reasoning is structured, stabilized, and altered in modern language models.
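
The abstract describes ThinkARM only conceptually; it does not specify an implementation. As a rough illustration of what an episode-level representation affords, the minimal sketch below takes a reasoning trace whose segments have already been tagged with Schoenfeld-style episode labels and counts the transitions between episodes, the kind of structural statistic underlying the "thinking dynamics" the abstract mentions. All identifiers here (`transition_counts`, the toy trace) are hypothetical, and the labeling is hand-written; in the paper, segment labeling is performed at scale by the ThinkARM framework.

```python
from collections import Counter
from itertools import pairwise  # Python 3.10+

# Episode labels named in the abstract; the paper's full label set is larger.
EPISODES = ["Analysis", "Explore", "Implement", "Verify"]

def transition_counts(episode_seq):
    """Count episode-to-episode transitions in a labeled reasoning trace."""
    return Counter(pairwise(episode_seq))

# A hand-labeled toy trace, one episode label per trace segment.
trace = ["Analysis", "Explore", "Explore", "Implement",
         "Verify", "Explore", "Implement", "Verify"]
assert all(label in EPISODES for label in trace)

# Print the transition table; aggregated over many traces, statistics like
# these could expose branching at Explore steps or suppressed Verify steps.
for (src, dst), n in sorted(transition_counts(trace).items()):
    print(f"{src:>8} -> {dst:<9} {n}")
```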