Avoiding Premature Collapse: Adaptive Annealing for Entropy-Regularized Structural Inference

Abstract

Differentiable matching layers and residual connection paradigms, often implemented via entropy-regularized Optimal Transport (OT), serve as critical mechanisms in structural prediction and architectural scaling. However, recovering discrete permutations or maintaining identity mappings by annealing ε to 0 is notoriously unstable. In this work, we identify a fundamental mechanism behind this failure: Premature Mode Collapse. By analyzing the non-normal dynamics of the Sinkhorn fixed-point map, we reveal a theoretical thermodynamic speed limit: standard exponential cooling outpaces the contraction rate of the inference operator, which degrades as O(1/ε). To address this, we propose Efficient Piecewise Hybrid Adaptive Stability Control (EPH-ASC), an adaptive scheduling algorithm that monitors the stability of the inference process. We demonstrate that EPH-ASC is essential for stabilizing Manifold-Constrained Hyper-Connections (mHC) during large-scale training on the FineWeb-Edu dataset, where it prevents late-stage gradient explosions by enforcing a linear stability law.
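As a rough illustration of why cooling ε too quickly can outrun the inner solver, the sketch below runs a generic log-domain Sinkhorn iteration on a toy 2×2 cost matrix and counts the iterations needed to reach a fixed marginal tolerance at several values of ε. It is not the EPH-ASC algorithm and does not reproduce the paper's analysis; the function name `sinkhorn_iterations`, the cost matrix, the tolerance, and the ε grid are all assumptions chosen for illustration.

```python
import numpy as np


def logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))


def sinkhorn_iterations(cost, eps, tol=1e-8, max_iter=200_000):
    """Log-domain Sinkhorn with uniform marginals (illustrative helper).

    Returns the transport plan and the number of iterations used to bring
    the largest row-marginal violation below `tol`.
    """
    n, m = cost.shape
    log_r = np.full(n, -np.log(n))  # log of uniform row marginals
    log_c = np.full(m, -np.log(m))  # log of uniform column marginals
    f = np.zeros(n)                 # row dual potentials
    g = np.zeros(m)                 # column dual potentials
    plan = None
    for it in range(1, max_iter + 1):
        # Alternate the two marginal projections in the dual variables.
        f = eps * (log_r - logsumexp((g[None, :] - cost) / eps, axis=1))
        g = eps * (log_c - logsumexp((f[:, None] - cost) / eps, axis=0))
        plan = np.exp((f[:, None] + g[None, :] - cost) / eps)
        # Columns are exact after the g-update; measure the row violation.
        row_err = np.abs(plan.sum(axis=1) - np.exp(log_r)).max()
        if row_err < tol:
            return plan, it
    return plan, max_iter


# Toy asymmetric cost (an assumption); the identity matching is optimal.
cost = np.array([[0.0, 1.0],
                 [0.5, 0.0]])
eps_values = [1.0, 0.5, 0.25, 0.125]
counts = [sinkhorn_iterations(cost, eps)[1] for eps in eps_values]

for eps, k in zip(eps_values, counts):
    print(f"eps={eps:<6} iterations to tolerance: {k}")
```

On this toy problem the iteration count grows as ε shrinks, so a schedule that lowers ε while allotting a fixed number of inner iterations per step would leave the solver progressively further from its fixed point. How this relates quantitatively to the O(1/ε) rate stated above is not something this sketch establishes.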