Mechanistic Interpretability of Large-Scale Counting in LLMs through a System-2 Strategy

Abstract

Large language models (LLMs) perform strongly on complex mathematical problems yet exhibit systematic limitations on counting tasks. The limitation stems from the transformer architecture: counting is carried out across layers, so the model's finite depth causes accuracy to degrade as the counting problem grows. To address this, we propose a simple test-time strategy, inspired by System-2 cognitive processes, that decomposes a large counting task into smaller, independent sub-problems the model can solve reliably. We use observational and causal mediation analyses to uncover the mechanism behind this System-2-like strategy. The mechanistic analysis identifies the key components: latent counts are computed and stored in the representation of the final item of each part, transferred to intermediate steps by dedicated attention heads, and aggregated in the final stage to produce the total count. Experimental results show that the strategy lets LLMs overcome this architectural limitation and reach high accuracy on large-scale counting tasks. This work provides mechanistic insight into System-2 counting in LLMs and offers a generalizable approach for improving and understanding their reasoning behavior.
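
The decompose-then-aggregate idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `part_size` parameter, and the `count_small` callback are hypothetical, and `exact_count` is a deterministic stand-in for a model call that is assumed to be reliable on short inputs.

```python
from typing import Callable, Sequence


def partition(items: Sequence[str], part_size: int) -> list[list[str]]:
    """Split items into consecutive parts of at most part_size elements."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    return [list(items[i:i + part_size]) for i in range(0, len(items), part_size)]


def count_by_parts(
    items: Sequence[str],
    target: str,
    part_size: int,
    count_small: Callable[[Sequence[str], str], int],
) -> tuple[list[int], int]:
    """Count target occurrences part by part, then aggregate.

    count_small is a hypothetical stand-in for a counter (for example,
    a prompted LLM) that is assumed reliable only on short inputs.
    """
    parts = partition(items, part_size)
    # One independent sub-problem per part.
    partial_counts = [count_small(part, target) for part in parts]
    # Aggregation step: the total is the sum of the per-part counts.
    return partial_counts, sum(partial_counts)


def exact_count(part: Sequence[str], target: str) -> int:
    """Deterministic placeholder for the small-problem counter."""
    return sum(1 for item in part if item == target)


items = ["apple", "pear"] * 12 + ["apple"] * 5
partial, total = count_by_parts(items, "apple", part_size=8, count_small=exact_count)
print(partial, total)
```

The per-part counts play the role of the intermediate results the abstract refers to, and the final sum corresponds to the aggregation stage. The choice of `part_size` is an assumption here; the abstract does not state a value.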
