The Shape of Addition: Geometric Structures of Arithmetic in Large Language Models

Abstract

Large Language Models exhibit paradoxical fragility in fundamental arithmetic, implying a disconnect between internal computation and discrete output. By analyzing the residual stream geometry during multi-operand addition, we identify the Iso-Raw-Sum Trajectory (IRST), a geometric structure where representations are anchored by semantic digits and modulated by continuous carry fibers. We propose the Noisy Quantization Model to explain this geometry, framing arithmetic errors as Geometric Slippages caused by internal neural noise pushing a continuous, latent Carry Potential across quantization thresholds. This geometric framework further elucidates Probe Versatility, explaining how lightweight probes can disentangle coexisting latent signals (such as ground truth versus hallucination) from a single activation vector. Finally, we validate these insights through a geometric consistency check method that effectively detects and corrects these quantization failures during inference. Our code is available at https://github.com/RL-MIND/Shape-of-Addition.
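The intuition behind the Noisy Quantization Model can be illustrated with a toy simulation. The sketch below is not taken from the authors' repository. It assumes that the carry potential of a single digit column is the raw column sum divided by 10, that the internal noise is Gaussian, and that quantization is a floor operation. The function names (`carry_potential`, `noisy_carry`, `slippage_rate`) and the noise scale `sigma` are hypothetical and chosen only for illustration.

```python
import math
import random


def carry_potential(raw_sum: int) -> float:
    """Continuous carry potential of one digit column (assumed: raw_sum / 10)."""
    return raw_sum / 10.0


def noisy_carry(raw_sum: int, sigma: float, rng: random.Random) -> int:
    """Quantize the noise-perturbed potential into a discrete carry (assumed: floor)."""
    return math.floor(carry_potential(raw_sum) + rng.gauss(0.0, sigma))


def slippage_rate(raw_sum: int, sigma: float, trials: int = 10_000, seed: int = 0) -> float:
    """Fraction of trials in which the quantized carry differs from the true carry."""
    rng = random.Random(seed)  # fixed seed so the toy experiment is reproducible
    true_carry = raw_sum // 10
    errors = sum(noisy_carry(raw_sum, sigma, rng) != true_carry for _ in range(trials))
    return errors / trials


if __name__ == "__main__":
    # Raw sums close to a multiple of 10 sit near a quantization threshold.
    for raw_sum in (15, 18, 19, 20, 21, 25):
        print(raw_sum, slippage_rate(raw_sum, sigma=0.05))
```

Under these assumptions, raw sums whose potential lies close to an integer threshold (for example 19 or 20) slip far more often than sums in the middle of a quantization bin (for example 15 or 25), which mirrors the abstract's description of errors as noise pushing a continuous potential across a threshold.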