Avoiding Premature Collapse: Adaptive Annealing for Entropy-Regularized Structural Inference
January 30, 2026
Author: Yizhi Liu
cs.AI
Abstract
Differentiable matching layers and residual connection paradigms, often implemented via entropy-regularized Optimal Transport (OT), serve as critical mechanisms in structural prediction and architectural scaling. However, recovering discrete permutations or maintaining identity mappings via annealing ε to 0 is notoriously unstable. In this work, we identify a fundamental mechanism for this failure: Premature Mode Collapse. By analyzing the non-normal dynamics of the Sinkhorn fixed-point map, we reveal a theoretical thermodynamic speed limit: standard exponential cooling outpaces the contraction rate of the inference operator, which degrades as O(1/ε). To address this, we propose Efficient Piecewise Hybrid Adaptive Stability Control (EPH-ASC), an adaptive scheduling algorithm that monitors the stability of the inference process. We demonstrate that EPH-ASC is essential for stabilizing Manifold-Constrained Hyper-Connections (mHC) during large-scale training on the FineWeb-Edu dataset, effectively preventing late-stage gradient explosions by enforcing a linear stability law.
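For readers unfamiliar with the setup, the following is a minimal sketch of the baseline the abstract criticizes: log-domain Sinkhorn iterations on a small cost matrix, run under a fixed exponential cooling schedule ε_t = ε_0 · γ^t with warm-started dual variables. It is not EPH-ASC and is not taken from the paper. The function names (`sinkhorn`, `exponential_cooling`), the 3×3 cost matrix, and all parameter values are hypothetical choices made for illustration. The sketch only records how many iterations each temperature needs, so the iteration count can be inspected as ε shrinks; this particular toy matrix has an unambiguous assignment and is not expected to reproduce the collapse described above.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def sinkhorn(cost, eps, g=None, max_iters=1000, tol=1e-6):
    """Log-domain Sinkhorn for an n x n cost with unit row/column marginals.

    Returns (plan, g, iterations_used, row_error). `g` is the column dual,
    returned so the next temperature can be warm-started from it.
    """
    n = len(cost)
    g = [0.0] * n if g is None else list(g)
    for iters in range(1, max_iters + 1):
        # Row update: choose f so that every row of the plan sums to 1.
        f = [-eps * logsumexp([(g[j] - cost[i][j]) / eps for j in range(n)])
             for i in range(n)]
        # Column update: choose g so that every column of the plan sums to 1.
        g = [-eps * logsumexp([(f[i] - cost[i][j]) / eps for i in range(n)])
             for j in range(n)]
        plan = [[math.exp((f[i] + g[j] - cost[i][j]) / eps) for j in range(n)]
                for i in range(n)]
        # Columns are exact after the g update; rows measure the residual.
        row_err = max(abs(sum(row) - 1.0) for row in plan)
        if row_err < tol:
            break
    return plan, g, iters, row_err


def exponential_cooling(cost, eps0=1.0, gamma=0.5, steps=5):
    """Fixed schedule eps_t = eps0 * gamma**t (assumed baseline, not adaptive)."""
    g, plan, history = None, None, []
    for t in range(steps):
        eps = eps0 * gamma ** t
        plan, g, iters, row_err = sinkhorn(cost, eps, g)
        history.append((eps, iters, row_err))
    return plan, history


# Hypothetical cost matrix whose cheapest assignment is the identity.
COST = [[0.1, 0.9, 0.8],
        [0.7, 0.2, 0.9],
        [0.9, 0.8, 0.1]]

plan, history = exponential_cooling(COST)
for eps, iters, row_err in history:
    print(f"eps={eps:.4f}  iterations={iters}  row_error={row_err:.2e}")
```

An adaptive scheme in the spirit of the abstract would replace the fixed `gamma` with a decision based on a stability signal, for example the recorded `row_err` or iteration count; how EPH-ASC actually makes that decision is not specified in the abstract.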