Language Models are Symbolic Learners in Arithmetic
October 21, 2024
作者: Chunyuan Deng, Zhiqi Li, Roy Xie, Ruidi Chang, Hanjie Chen
cs.AI
Abstract
Large Language Models (LLMs) are thought to struggle with arithmetic learning
due to the inherent differences between language modeling and numerical
computation, but concrete evidence has been lacking. This work responds to this
claim through a two-sided experiment. We first investigate whether LLMs leverage
partial products during arithmetic learning. We find that although LLMs can
identify some partial products after learning, they fail to leverage them for
arithmetic tasks; conversely, training on partial products does not improve their
arithmetic performance. We then explore how LLMs approach arithmetic
symbolically by breaking tasks into subgroups, hypothesizing that difficulties
arise from subgroup complexity and selection. Our results show that when
subgroup complexity is fixed, LLMs treat a collection of different arithmetic
operations similarly. By analyzing position-level accuracy across different
training sizes, we further observe that it follows a U-shaped pattern: LLMs
quickly learn the easiest patterns at the first and last positions, while
progressively learning the more difficult patterns in the middle positions.
This suggests that LLMs select subgroups following an easy-to-hard paradigm
during learning. Our work confirms that LLMs are pure symbolic learners in
arithmetic tasks and underscores the importance of understanding them deeply
through subgroup-level quantification.
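To make the subgroup-level quantification concrete, the sketch below (illustrative only, not code from the paper) shows one way position-level accuracy over answer digits could be computed, the per-position metric whose U-shaped profile the abstract describes; the function name, zero-padding convention, and toy data are assumptions.

    def position_level_accuracy(predictions, references, width):
        # Per-position (digit-wise) accuracy over a batch of answer strings,
        # assuming answers are left-padded with zeros to a fixed width.
        correct = [0] * width
        for pred, ref in zip(predictions, references):
            pred, ref = pred.zfill(width), ref.zfill(width)
            for i in range(width):
                if pred[i] == ref[i]:
                    correct[i] += 1
        return [c / len(references) for c in correct]

    # Toy example with 3-digit answers: the first and last digits are easier
    # than the middle digit, giving the U-shaped accuracy pattern noted above.
    preds = ["112", "115", "216"]
    golds = ["112", "125", "216"]
    print(position_level_accuracy(preds, golds, width=3))  # ≈ [1.0, 0.67, 1.0]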