Avoiding Premature Collapse: Adaptive Annealing for Entropy-Regularized Structural Inference
January 30, 2026
Author: Yizhi Liu
cs.AI
Abstract
Differentiable matching layers and residual connection paradigms, often implemented via entropy-regularized Optimal Transport (OT), serve as critical mechanisms in structural prediction and architectural scaling. However, recovering discrete permutations or maintaining identity mappings via annealing ε to 0 is notoriously unstable. In this work, we identify a fundamental mechanism for this failure: Premature Mode Collapse. By analyzing the non-normal dynamics of the Sinkhorn fixed-point map, we reveal a theoretical thermodynamic speed limit: standard exponential cooling outpaces the contraction rate of the inference operator, which degrades as O(1/ε). To address this, we propose Efficient Piecewise Hybrid Adaptive Stability Control (EPH-ASC), an adaptive scheduling algorithm that monitors the stability of the inference process. We demonstrate that EPH-ASC is essential for stabilizing Manifold-Constrained Hyper-Connections (mHC) during large-scale training on the FineWeb-Edu dataset, effectively preventing late-stage gradient explosions by enforcing a linear stability law.
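To make the setting concrete, the following is a minimal sketch of a log-domain Sinkhorn solver together with a simple stability-gated cooling loop. It is not the EPH-ASC algorithm, whose details are not given in the abstract. The function names (`sinkhorn`, `gated_anneal`), the uniform marginals, the geometric `decay` factor, and the rule "cool ε only once the row-marginal residual falls below `tol`" are all assumptions chosen for illustration.

```python
import numpy as np


def _logsumexp(a, axis):
    """Numerically stable log-sum-exp along one axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))


def sinkhorn(cost, eps, g=None, tol=1e-6, max_iters=1000):
    """Log-domain Sinkhorn toward a doubly stochastic plan (illustrative).

    The plan is P_ij = exp((f_i + g_j - cost_ij) / eps). Passing `g`
    warm-starts the column potential from a previous run.
    Returns (P, g, iterations_used, row_marginal_error).
    """
    n = cost.shape[0]
    g = np.zeros(n) if g is None else g.copy()
    for it in range(1, max_iters + 1):
        # Row update: choose f so every row of P sums to 1.
        f = -eps * _logsumexp((g[None, :] - cost) / eps, axis=1)
        # Column update: choose g so every column of P sums to 1.
        g = -eps * _logsumexp((f[:, None] - cost) / eps, axis=0)
        P = np.exp((f[:, None] + g[None, :] - cost) / eps)
        # Columns are exact after the g update; rows carry the residual.
        err = np.abs(P.sum(axis=1) - 1.0).max()
        if err < tol:
            break
    return P, g, it, err


def gated_anneal(cost, eps0=1.0, eps_min=0.05, decay=0.7, tol=1e-6,
                 budget=50, max_stages=500):
    """Hypothetical stability-gated schedule (an assumption, not EPH-ASC).

    Each stage spends at most `budget` Sinkhorn iterations. eps is lowered
    only if the inner solver converged; otherwise eps is held so the
    iteration can catch up before further cooling.
    """
    eps, g = eps0, None
    schedule = []
    for _ in range(max_stages):
        P, g, iters, err = sinkhorn(cost, eps, g, tol=tol, max_iters=budget)
        schedule.append((eps, iters, err))
        if err < tol:
            if eps <= eps_min:
                break
            eps = max(eps * decay, eps_min)
    return P, eps, schedule


rng = np.random.default_rng(0)
cost = rng.random((5, 5))
P, eps_final, schedule = gated_anneal(cost)
print(f"stages: {len(schedule)}, final eps: {eps_final:.3f}")
```

A fixed exponential schedule would lower ε at every stage regardless of the residual. The gated variant holds ε whenever the inner iteration has not yet converged, which is one simple way to keep the cooling rate from outrunning a contraction rate that slows as ε shrinks.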