Revisiting State Tracking in Recurrent Models through Error-Control Dynamics

Abstract

The theory of state tracking in recurrent architectures has predominantly focused on expressive capacity: whether a fixed architecture can, in theory, realize a set of symbolic transition rules. We argue that error control is equally important: the dynamics governing hidden-state drift along the directions that distinguish symbolic states. We prove that affine recurrent networks, a class of models encompassing State-Space Models and Linear Attention, cannot correct errors along state-separating subspaces once they preserve state representations. Consequently, practical affine trackers do not learn robust state tracking; rather, they learn finite-horizon solutions governed by accumulated state-relevant error. We characterize the mechanics of this failure, showing that tracking remains readable only while the accumulating within-class spread remains small relative to the initial between-class separation. We demonstrate empirically on group state-tracking tasks that this breakdown is predictable: tracking collapses when the distinguishability ratio crosses the readability threshold of the trained decoder. Across trained models, the point of this crossing predicts the horizon at which downstream accuracy fails. These results establish that robust state tracking is determined not only by an architecture's theoretical expressivity but also, crucially, by its error control.
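
The mechanism described in the abstract can be illustrated with a toy simulation. The sketch below is not the paper's experimental setup; it is a minimal example under stated assumptions: the symbolic state is a running sum in the cyclic group Z_m, the hidden state is a 2D vector updated by a norm-preserving rotation, a small Gaussian perturbation is injected at every step, and the decoder is a fixed nearest-centroid rule rather than a trained one. The function name `simulate` and the parameters `m`, `steps`, `trials`, `sigma`, and `seed` are hypothetical names chosen for illustration.

```python
import math
import random


def simulate(m=5, steps=400, trials=200, sigma=0.02, seed=0):
    """Track a running sum mod m with a noisy rotation recurrence.

    Returns one (ratio, accuracy) pair per step, where ratio is the RMS
    within-class spread divided by the ideal between-class separation.
    All modelling choices here are assumptions made for illustration.
    """
    rng = random.Random(seed)
    # Ideal class centroids: m equally spaced points on the unit circle.
    centroids = [(math.cos(2 * math.pi * k / m), math.sin(2 * math.pi * k / m))
                 for k in range(m)]
    # Distance between adjacent centroids (initial between-class separation).
    separation = 2 * math.sin(math.pi / m)

    states = [0] * trials            # true symbolic state per trial
    hidden = [(1.0, 0.0)] * trials   # hidden vector per trial
    history = []
    for _ in range(steps):
        sq_err = 0.0
        correct = 0
        for i in range(trials):
            x = rng.randrange(m)     # input symbol, an element of Z_m
            a = 2 * math.pi * x / m
            c, s = math.cos(a), math.sin(a)
            hx, hy = hidden[i]
            # Norm-preserving update plus per-step noise. The rotation
            # carries existing error forward without shrinking it.
            hx, hy = (c * hx - s * hy + rng.gauss(0, sigma),
                      s * hx + c * hy + rng.gauss(0, sigma))
            hidden[i] = (hx, hy)
            states[i] = (states[i] + x) % m
            cx, cy = centroids[states[i]]
            sq_err += (hx - cx) ** 2 + (hy - cy) ** 2
            # Nearest-centroid decoder (a stand-in for a trained decoder).
            pred = min(range(m),
                       key=lambda k: (hx - centroids[k][0]) ** 2
                       + (hy - centroids[k][1]) ** 2)
            correct += int(pred == states[i])
        spread = math.sqrt(sq_err / trials)
        history.append((spread / separation, correct / trials))
    return history


if __name__ == "__main__":
    hist = simulate()
    for t in (0, 99, 399):
        ratio, acc = hist[t]
        print(f"step {t + 1:4d}  ratio={ratio:.3f}  accuracy={acc:.3f}")
```

Because the rotation preserves norms, the deviation from the ideal centroid is never contracted in this toy model, so the spread-to-separation ratio grows with the number of steps while decoding accuracy eventually degrades. Lowering `sigma` or `m` pushes the failure horizon further out without removing it.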