Pion: A Spectrum-Preserving Optimizer via Orthogonal Equivalence Transformation

Abstract

We introduce Pion, a spectrum-preserving optimizer for large language model (LLM) training based on orthogonal equivalence transformation. Unlike additive optimizers such as Adam and Muon, Pion updates each weight matrix through left and right orthogonal transformations, preserving its singular values throughout training. This yields an optimization mechanism that modulates the geometry of weight matrices while keeping their spectral norm fixed. We derive the Pion update rule, systematically examine its design choices, and analyze its convergence behavior along with several key properties. Empirical results show that Pion offers a stable and competitive alternative to standard optimizers for both LLM pretraining and finetuning.
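
The spectrum-preserving property described above can be checked numerically. The following is a minimal sketch of a generic orthogonal equivalence transformation, not the Pion update rule itself, which the abstract does not spell out. It assumes the orthogonal factors are built from random skew-symmetric generators through a Cayley transform; that construction, and the helper names `cayley`, `random_skew`, and `orthogonal_equivalence_step`, are illustrative choices rather than anything taken from the paper.

```python
import numpy as np


def cayley(skew):
    """Map a skew-symmetric matrix to an orthogonal one (Cayley transform)."""
    eye = np.eye(skew.shape[0])
    # (I - A) is always invertible when A is skew-symmetric.
    return np.linalg.solve(eye - skew, eye + skew)


def random_skew(n, scale, rng):
    """Hypothetical stand-in for an optimizer-chosen skew-symmetric generator."""
    m = rng.standard_normal((n, n)) * scale
    return m - m.T


def orthogonal_equivalence_step(w, left_gen, right_gen):
    """Apply W -> Q_left @ W @ Q_right^T with orthogonal Q_left, Q_right."""
    q_left = cayley(left_gen)
    q_right = cayley(right_gen)
    return q_left @ w @ q_right.T


rng = np.random.default_rng(0)
w = rng.standard_normal((4, 3))
w_new = orthogonal_equivalence_step(
    w, random_skew(4, 0.1, rng), random_skew(3, 0.1, rng)
)

# The entries of W change, but its singular values (and hence its
# spectral norm) stay the same up to floating-point error.
sv_before = np.linalg.svd(w, compute_uv=False)
sv_after = np.linalg.svd(w_new, compute_uv=False)
print(np.allclose(sv_before, sv_after))
```

An additive step such as `w - lr * grad` would in general change the singular values, which is the contrast the abstract draws with Adam and Muon.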
