

ROOT: Robust Orthogonalized Optimizer for Neural Network Training

November 25, 2025
作者: Wei He, Kai Han, Hang Zhou, Hanting Chen, Zhicheng Liu, Xinghao Chen, Yunhe Wang
cs.AI

Abstract

The optimization of large language models (LLMs) remains a critical challenge, particularly as model scaling exacerbates sensitivity to algorithmic imprecision and training instability. Recent advances in optimizers have improved convergence efficiency through momentum orthogonalization, but suffer from two key robustness limitations: dimensional fragility in orthogonalization precision and vulnerability to outlier-induced noise. To address these robustness challenges, we introduce ROOT, a Robust Orthogonalized Optimizer that enhances training stability through dual robustness mechanisms. First, we develop a dimension-robust orthogonalization scheme using adaptive Newton iterations with fine-grained coefficients tailored to specific matrix sizes, ensuring consistent precision across diverse architectural configurations. Second, we introduce an optimization-robust framework via proximal optimization that suppresses outlier noise while preserving meaningful gradient directions. Extensive experiments demonstrate that ROOT achieves significantly improved robustness, with faster convergence and superior final performance compared to both Muon and Adam-based optimizers, particularly in noisy and non-convex scenarios. Our work establishes a new paradigm for developing robust and precise optimizers capable of handling the complexities of modern large-scale model training. The code will be available at https://github.com/huawei-noah/noah-research/tree/master/ROOT.
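The momentum orthogonalization the abstract refers to is the Newton–Schulz iteration popularized by Muon. A minimal NumPy sketch is below; the quintic coefficients shown are Muon's published ones, used here as a stand-in, since ROOT's fine-grained, per-matrix-size coefficients are not given in the abstract.

```python
import numpy as np

def newton_schulz_orthogonalize(G, steps=5, coeffs=(3.4445, -4.7750, 2.0315)):
    """Approximately orthogonalize the momentum matrix G.

    Applies the quintic Newton-Schulz iteration X <- aX + (bA + cA^2)X
    with A = X X^T, which drives all singular values of X toward 1 while
    preserving the singular vectors. The coefficients are Muon's; ROOT
    instead adapts them to the matrix dimensions (details not public here).
    """
    a, b, c = coeffs
    # Frobenius normalization guarantees all singular values start in (0, 1].
    X = G / (np.linalg.norm(G) + 1e-7)
    transposed = X.shape[0] > X.shape[1]
    if transposed:  # keep A = X X^T as the smaller Gram matrix
        X = X.T
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * (A @ A)) @ X
    return X.T if transposed else X
```

Because the iteration is a polynomial in the singular values, a poor coefficient choice for a given aspect ratio leaves some singular values far from 1, which is the "dimensional fragility" the paper targets.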
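The second mechanism, proximal suppression of outlier noise, can be illustrated with a Huber-style robust momentum update: the residual between the incoming gradient and the running momentum is clipped at a robust scale before it is folded in. This is an illustrative sketch of the general idea, not ROOT's exact update rule; the MAD-based threshold and the factor `k` are assumptions.

```python
import numpy as np

def mad_scale(x):
    """Median absolute deviation: a robust estimate of spread."""
    return np.median(np.abs(x - np.median(x)))

def robust_momentum_update(m, g, beta=0.9, k=3.0):
    """One outlier-suppressing momentum step (hedged sketch, not ROOT's rule).

    Without clipping this is the usual EMA m <- beta*m + (1-beta)*g.
    Clipping the residual g - m at k * MAD caps entries that disagree
    wildly with the running estimate (likely outliers), while entries
    consistent with it pass through, preserving the gradient direction.
    """
    r = g - m
    c = k * mad_scale(r) + 1e-12  # epsilon avoids a zero threshold
    return m + (1.0 - beta) * np.clip(r, -c, c)
```

A single corrupted gradient entry, which would otherwise dominate the momentum for many steps, contributes at most `(1 - beta) * c` per update under this rule.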
PDF · December 1, 2025