Metis: Training Large Language Models with Advanced Low-Bit Quantization

Abstract

This work identifies anisotropic parameter distributions as a fundamental barrier to training large language models (LLMs) with low-bit quantization: a few dominant singular values create wide numerical ranges that conflict with the inherent bias of block-wise quantization. This bias disproportionately preserves high-magnitude values while discarding smaller ones, causing training instability and degraded model performance. To address this, the work introduces Metis, a training framework that combines (i) spectral decomposition with random embedding to efficiently disentangle dominant components from long-tail components, compressing broad distributions into quantization-friendly narrow ranges; (ii) adaptive learning rates in the spectral domain to amplify underrepresented directions and better capture the diverse features critical for performance; and (iii) a dual-range regularizer that jointly constrains numerical precision and parameter range distribution, ensuring stable, unbiased low-bit training. With Metis, FP8 training surpasses FP32 baselines and FP4 training achieves accuracy comparable to FP32, paving the way for robust and scalable LLM training under advanced low-bit quantization. The code implementation for Metis is available at https://github.com/typename-yyf/Metis-quantization.
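
The conflict between a few dominant singular values and block-wise quantization can be illustrated with a small NumPy sketch. It is not the Metis implementation, and every name and parameter in it is hypothetical. It assumes a symmetric integer grid with a per-block absmax scale as a stand-in for a 4-bit format, uses a plain truncated SVD in place of the random embedding mentioned above, and keeps the dominant rank-k part in full precision purely for simplicity.

```python
import numpy as np

def blockwise_quantize(x, block=32, levels=7):
    """Round each block of `block` values onto a symmetric integer grid
    scaled by the block's largest magnitude (illustrative 4-bit stand-in)."""
    flat = x.reshape(-1, block)
    scale = np.abs(flat).max(axis=1, keepdims=True) / levels
    scale = np.where(scale == 0, 1.0, scale)  # guard against all-zero blocks
    return (np.round(flat / scale) * scale).reshape(x.shape)

def make_anisotropic(n=256, k=4, seed=0):
    """Build a matrix with k dominant singular values and a long tail."""
    rng = np.random.default_rng(seed)
    u, _ = np.linalg.qr(rng.standard_normal((n, n)))
    v, _ = np.linalg.qr(rng.standard_normal((n, n)))
    s = np.concatenate([np.full(k, 100.0), np.linspace(1.0, 0.1, n - k)])
    return (u * s) @ v.T

def split_dominant(w, k):
    """Separate the top-k spectral component from the long-tail residual."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    dominant = (u[:, :k] * s[:k]) @ vt[:k]
    return dominant, w - dominant

w = make_anisotropic()
dominant, tail = split_dominant(w, k=4)

# Direct quantization: block scales are set by the dominant directions,
# so the small long-tail entries fall below the rounding step.
err_direct = np.linalg.norm(blockwise_quantize(w) - w)

# Split quantization: the residual has a narrow range, so its block
# scales are fine enough to preserve the long-tail values.
err_split = np.linalg.norm(dominant + blockwise_quantize(tail) - w)

tail_norm = np.linalg.norm(tail)
print(f"direct error: {err_direct:.3f}")
print(f"split error:  {err_split:.3f}")
print(f"tail norm:    {tail_norm:.3f}")
```

Comparing the two error values against the norm of the long-tail component gives a rough sense of how much small-magnitude information each scheme retains. A real low-bit pipeline would also need to handle the dominant factors in reduced precision, which this sketch deliberately leaves out.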

