Rethinking State Tracking in Recurrent Models Through Error Control Dynamics

Abstract

The theory of state tracking in recurrent architectures has predominantly focused on expressive capacity: whether a fixed architecture can theoretically realize a set of symbolic transition rules. We argue that error control, the dynamics governing hidden-state drift along the directions that distinguish symbolic states, is equally important. We prove that affine recurrent networks, a class of models encompassing State-Space Models and Linear Attention, cannot correct errors along state-separating subspaces once they preserve state representations. Consequently, practical affine trackers do not learn robust state tracking; rather, they learn finite-horizon solutions governed by accumulated state-relevant error. We characterize the mechanics of this failure, showing that tracking remains readable only while the accumulating within-class spread remains small relative to the initial between-class separation. We demonstrate empirically on group state-tracking tasks that this breakdown is predictable: tracking collapses when the distinguishability ratio crosses the readability threshold of the trained decoder. Across trained models, the point of this crossing predicts the horizon at which downstream accuracy fails. These results establish that robust state tracking is determined not only by an architecture's theoretical expressivity but crucially by its error control.
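
The error-accumulation claim can be illustrated with a toy sketch. It is not the experimental setup of this work. It assumes a cyclic group Z_n, one-hot state codes, and cyclic-shift (permutation) updates, which are a special case of an affine recurrence with zero bias. The names `shift`, `error_norm`, and `run`, and the Gaussian per-step noise, are hypothetical choices made for illustration only. Because the update must keep distinct state codes apart, it also carries any deviation from the clean trajectory forward unchanged, so per-step noise tends to add up.

```python
import math
import random


def shift(h, k):
    """Cyclic-shift update for group element k in Z_n (linear, hence affine with zero bias)."""
    n = len(h)
    return [h[(i - k) % n] for i in range(n)]


def error_norm(h, h_clean):
    """Euclidean distance between a perturbed hidden state and the clean one."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h, h_clean)))


def run(n=5, steps=200, noise=0.0, initial_error=0.0, seed=0):
    """Track a random walk on Z_n and record the deviation from the clean state."""
    rng = random.Random(seed)
    clean = [1.0] + [0.0] * (n - 1)  # one-hot code for the identity element
    noisy = list(clean)
    noisy[1] += initial_error  # perturbation along a direction that separates states
    norms = []
    for _ in range(steps):
        k = rng.randrange(n)  # random input symbol
        clean = shift(clean, k)
        noisy = shift(noisy, k)  # same update: the deviation is permuted, not shrunk
        noisy = [v + rng.gauss(0.0, noise) for v in noisy]  # assumed per-step noise
        norms.append(error_norm(noisy, clean))
    return norms
```

In this sketch, `run(noise=0.0, initial_error=0.1)` keeps the deviation at 0.1 for every step, since a permutation neither shrinks nor grows it, while `run(noise=0.01)` shows the deviation drifting upward over the horizon, in the manner of a random walk.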
