The Shape of Addition: The Geometric Structure of Arithmetic in Large Language Models

Abstract

Large Language Models exhibit paradoxical fragility in fundamental arithmetic, suggesting a disconnect between internal computation and discrete output. By analyzing the geometry of the residual stream during multi-operand addition, we identify the Iso-Raw-Sum Trajectory (IRST), a geometric structure in which representations are anchored by semantic digits and modulated by continuous carry fibers. We propose the Noisy Quantization Model to explain this geometry. The model frames arithmetic errors as Geometric Slippages, which occur when internal neural noise pushes a continuous, latent Carry Potential across a quantization threshold. This geometric framework further explains Probe Versatility: how lightweight probes can disentangle coexisting latent signals (such as ground truth versus hallucination) from a single activation vector. Finally, we validate these insights with a geometric consistency check that detects and corrects such quantization failures during inference. Our code is available at https://github.com/RL-MIND/Shape-of-Addition.
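To make the Noisy Quantization Model concrete, the following toy sketch simulates a continuous carry potential perturbed by noise and then quantized to an integer carry. It is an illustration under stated assumptions, not the implementation in the linked repository: the function names, the Gaussian noise, and the use of `floor` as the quantization step are all hypothetical choices made here.

```python
import math
import random


def carry_potential(operands, column):
    """Continuous carry flowing into `column` (0 = units) from the lower digits.

    Assumption: the potential is the sum of the lower-order parts of all
    operands, scaled so that integer values mark the quantization thresholds.
    """
    modulus = 10 ** column
    lower_sum = sum(x % modulus for x in operands)
    return lower_sum / modulus


def quantize_carry(potential, noise_std, rng):
    """Add Gaussian noise (an assumption) and quantize to an integer carry."""
    return math.floor(potential + rng.gauss(0.0, noise_std))


def slippage_rate(operands, column, noise_std, trials=10_000, seed=0):
    """Fraction of trials in which noise moves the carry across a threshold."""
    rng = random.Random(seed)
    potential = carry_potential(operands, column)
    true_carry = math.floor(potential)
    slips = sum(
        quantize_carry(potential, noise_std, rng) != true_carry
        for _ in range(trials)
    )
    return slips / trials


# Units digits 9 + 9 + 1 = 19: potential 1.9, only 0.1 below the threshold at 2.
near_threshold = [19, 29, 41]
# Units digits 2 + 1 + 2 = 5: potential 0.5, as far from a threshold as possible.
far_from_threshold = [12, 21, 42]

if __name__ == "__main__":
    for name, ops in [("near", near_threshold), ("far", far_from_threshold)]:
        rate = slippage_rate(ops, column=1, noise_std=0.1)
        print(f"{name}: potential={carry_potential(ops, 1):.1f}, slippage rate={rate:.3f}")
```

In this toy setting, operands whose carry potential sits close to an integer threshold slip far more often than operands whose potential sits mid-interval, which is the qualitative behavior the abstract attributes to Geometric Slippages.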