Unified Scaling Laws for Compressed Representations
June 2, 2025
Authors: Andrei Panferov, Alexandra Volkova, Ionut-Vlad Modoranu, Vage Egiazarian, Mher Safaryan, Dan Alistarh
cs.AI
Abstract
Scaling laws have shaped recent advances in machine learning by enabling
predictable scaling of model performance based on model size, computation, and
data volume. Concurrently, the rising computational cost of AI has motivated
model compression techniques, notably quantization and sparsification, which
mitigate the steep computational demands of large-scale training and
inference. This paper investigates the interplay
between scaling laws and compression formats, exploring whether a unified
scaling framework can accurately predict model performance when training occurs
over various compressed representations, such as sparse, scalar-quantized,
sparse-quantized, or even vector-quantized formats. Our key contributions
include validating a general scaling law formulation and showing that it
applies not only to individual compression types but also composably across
combinations of them. Building on this, our main finding is a demonstration,
both theoretical and empirical, that there exists a simple "capacity" metric
-- based on a representation's ability to fit random Gaussian data -- which
robustly predicts parameter efficiency across multiple compressed
representations. On the
practical side, we extend our formulation to directly compare the accuracy
potential of different compressed formats, and to derive better algorithms for
training over sparse-quantized formats.
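
The abstract does not spell out the scaling-law formulation itself. One plausible reading, consistent with the stated "capacity" finding and with standard Chinchilla-style laws, is that a representation's capacity acts as a multiplier on the effective parameter count; the form below, including the symbols $C_R$, $A$, $B$, $\alpha$, $\beta$, and $E$, is an editorial sketch rather than the paper's stated equation:

$$
L(N, D; R) \;\approx\; \frac{A}{\bigl(C_R \cdot N\bigr)^{\alpha}} \;+\; \frac{B}{D^{\beta}} \;+\; E,
\qquad C_R \in (0, 1],
$$

where $N$ is the number of parameters stored in compressed representation $R$, $D$ is the number of training tokens, and $C_R$ is the representation's capacity, so a format with capacity $C_R$ behaves roughly like a dense model with $C_R \cdot N$ parameters.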
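
The capacity metric is described only as the representation's ability to fit random Gaussian data. The following is a minimal sketch of how such a score could be computed, assuming a normalized-MSE-based definition (capacity = fraction of Gaussian variance preserved by the compressed approximation); the function names (`quantize_symmetric`, `sparsify_magnitude`, `capacity`) and the round-to-nearest / magnitude-pruning choices are illustrative assumptions, not the paper's protocol.

```python
import numpy as np

def quantize_symmetric(x, bits):
    """Round-to-nearest symmetric scalar quantization with a per-tensor scale."""
    levels = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(x)) / levels
    return np.round(x / scale) * scale

def sparsify_magnitude(x, density):
    """Keep the largest-magnitude fraction `density` of entries, zero the rest."""
    k = max(1, int(density * x.size))
    thresh = np.sort(np.abs(x).ravel())[-k]
    return np.where(np.abs(x) >= thresh, x, 0.0)

def capacity(compress, n=1_000_000, seed=0):
    """Fraction of Gaussian variance preserved by a compressed representation."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    x_hat = compress(x)
    nmse = np.mean((x - x_hat) ** 2) / np.mean(x ** 2)
    return 1.0 - nmse

if __name__ == "__main__":
    print("INT4:", capacity(lambda x: quantize_symmetric(x, bits=4)))
    print("INT8:", capacity(lambda x: quantize_symmetric(x, bits=8)))
    print("50% sparse:", capacity(lambda x: sparsify_magnitude(x, density=0.5)))
    print("50% sparse + INT4:",
          capacity(lambda x: quantize_symmetric(sparsify_magnitude(x, 0.5), 4)))
```

Scores computed this way could then serve as the effective-parameter multiplier in a scaling law of the kind sketched above, with composed formats (e.g., sparse-quantized) evaluated by applying both compressors in sequence.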