Rethinking State Tracking in Recurrent Models Through Error Control Dynamics

Abstract

The theory of state tracking in recurrent architectures has focused predominantly on expressive capacity: whether a fixed architecture can, in principle, realize a set of symbolic transition rules. We argue that error control is equally important: the dynamics governing hidden-state drift along the directions that distinguish symbolic states. We prove that affine recurrent networks, a class of models that includes State-Space Models and Linear Attention, cannot correct errors along state-separating subspaces once they preserve state representations. Consequently, practical affine trackers do not learn robust state tracking; rather, they learn finite-horizon solutions governed by accumulated state-relevant error. We characterize the mechanics of this failure, showing that tracking remains readable only while the accumulating within-class spread stays small relative to the initial between-class separation. We demonstrate empirically on group state-tracking tasks that this breakdown is predictable: tracking collapses when the distinguishability ratio crosses the readability threshold of the trained decoder. Across trained models, the point of this crossing predicts the horizon at which downstream accuracy fails. These results establish that robust state tracking is determined not only by an architecture's theoretical expressivity but, crucially, by its error control.
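
The accumulation effect described above can be illustrated with a toy simulation. The sketch below is not the paper's model or experimental setup: it assumes a scalar linear tracker for parity (the group Z_2), additive Gaussian per-step noise as a stand-in for state-relevant error, and a sign decoder. The function name `simulate_parity_tracker`, its parameters, and the ratio of within-class spread to between-class separation used here are all hypothetical choices made for illustration.

```python
import numpy as np

def simulate_parity_tracker(num_steps=400, num_trials=2000, noise_std=0.05, seed=0):
    """Toy linear tracker for parity: h <- a(x) * h + noise, with a(0)=+1, a(1)=-1.

    Assumption: additive Gaussian noise models the per-step error; this is an
    illustrative stand-in, not the error model analyzed in the paper.
    """
    rng = np.random.default_rng(seed)
    bits = rng.integers(0, 2, size=(num_trials, num_steps))   # input symbols
    noise = rng.normal(0.0, noise_std, size=(num_trials, num_steps))

    h = np.ones(num_trials)        # hidden state, starts at the "even" code +1
    target = np.ones(num_trials)   # noise-free code of the true parity (+1 or -1)
    ratio = np.empty(num_steps)
    accuracy = np.empty(num_steps)

    for t in range(num_steps):
        a = 1.0 - 2.0 * bits[:, t]     # +1 keeps the parity, -1 flips it
        target = a * target
        h = a * h + noise[:, t]        # |a| = 1, so earlier errors are never contracted
        within = np.std(h * target)    # spread around the class code, both classes folded
        between = 2.0                  # distance between the ideal codes +1 and -1
        ratio[t] = within / between
        accuracy[t] = np.mean(np.sign(h) == target)   # sign decoder
    return ratio, accuracy

if __name__ == "__main__":
    ratio, accuracy = simulate_parity_tracker()
    for t in (0, 99, 399):
        print(f"step {t + 1:4d}  ratio={ratio[t]:.3f}  accuracy={accuracy[t]:.3f}")
```

In this toy setting the update multiplies the state by +1 or -1, which preserves the separation between the two codes but also preserves every earlier error, so the within-class spread grows roughly like `noise_std * sqrt(t)` and the sign decoder degrades as the ratio rises.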
