Unified Scaling Laws for Compressed Representations

Abstract

Scaling laws have shaped recent advances in machine learning by enabling predictable scaling of model performance based on model size, computation, and data volume. Concurrently, the rising computational cost of AI has motivated model compression techniques, notably quantization and sparsification, which mitigate the steep computational demands of large-scale training and inference. This paper investigates the interplay between scaling laws and compression formats, exploring whether a unified scaling framework can accurately predict model performance when training occurs over various compressed representations, such as sparse, scalar-quantized, sparse-quantized, or even vector-quantized formats. Our key contributions include validating a general scaling law formulation and showing that it applies both individually and composably across compression types. Building on this, our main finding, demonstrated both theoretically and empirically, is that a simple "capacity" metric, based on the representation's ability to fit random Gaussian data, can robustly predict parameter efficiency across multiple compressed representations. On the practical side, we extend our formulation to directly compare the accuracy potential of different compressed formats, and to derive better algorithms for training over sparse-quantized formats.
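
To make the idea of "ability to fit random Gaussian data" concrete, here is a minimal sketch in pure Python. It is an illustration under stated assumptions, not the paper's definition: the abstract does not give the exact capacity formula, so the score used here (one minus relative mean squared error), the uniform quantization grid with a clipping range of 3 standard deviations, the magnitude-based sparsification, and all function names are hypothetical choices made for this example.

```python
import random

def gaussian_sample(n, seed=0):
    """Draw n values from a standard normal, with a fixed seed for repeatability."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

def quantize_uniform(xs, bits, clip=3.0):
    """Round each value to the nearest of 2**bits evenly spaced levels in [-clip, clip]."""
    step = 2.0 * clip / (2 ** bits - 1)
    out = []
    for x in xs:
        clipped = max(-clip, min(clip, x))
        out.append(round((clipped + clip) / step) * step - clip)
    return out

def sparsify_topk(xs, density):
    """Keep the largest-magnitude `density` fraction of entries and zero the rest."""
    k = round(density * len(xs))
    order = sorted(range(len(xs)), key=lambda i: abs(xs[i]), reverse=True)
    keep = set(order[:k])
    return [x if i in keep else 0.0 for i, x in enumerate(xs)]

def sparse_quantized(xs, bits, density):
    """Quantize the kept entries; pruned entries stay exactly zero."""
    sparse = sparsify_topk(xs, density)
    quant = quantize_uniform(xs, bits)
    return [q if s != 0.0 else 0.0 for s, q in zip(sparse, quant)]

def fit_quality(xs, ys):
    """Assumed score: 1 - relative MSE. 1.0 is a perfect fit; lower means more loss."""
    err = sum((x - y) ** 2 for x, y in zip(xs, ys))
    energy = sum(x * x for x in xs)
    return 1.0 - err / energy

if __name__ == "__main__":
    data = gaussian_sample(10_000)
    for bits in (2, 4, 8):
        score = fit_quality(data, quantize_uniform(data, bits))
        print(f"{bits}-bit dense: {score:.4f}")
    print(f"50% sparse: {fit_quality(data, sparsify_topk(data, 0.5)):.4f}")
    print(f"50% sparse + 4-bit: {fit_quality(data, sparse_quantized(data, 4, 0.5)):.4f}")
```

Running the script prints one score per format, so formats can be ranked on a single axis. That is the kind of direct comparison the abstract describes, although the paper's own metric and its link to parameter efficiency may differ from this toy score.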
