Quartet II: Accurate LLM Pre-Training in NVFP4 by Improved Unbiased Gradient Estimation

Abstract

The NVFP4 lower-precision format, supported in hardware by NVIDIA Blackwell GPUs, promises to allow, for the first time, end-to-end fully-quantized pre-training of massive models such as LLMs. Yet existing quantized training methods still sacrifice some of the representation capacity of this format in favor of more accurate unbiased quantized gradient estimation by stochastic rounding (SR), losing noticeable accuracy relative to standard FP16 and FP8 training. In this paper, we improve the state of the art for quantized training in NVFP4 via a novel unbiased quantization routine for micro-scaled formats, called MS-EDEN, that has more than 2x lower quantization error than SR. We integrate it into a novel fully-NVFP4 quantization scheme for linear layers, called Quartet II. We show analytically that Quartet II achieves consistently better gradient estimation across all major matrix multiplications, on both the forward and the backward passes. In addition, our proposal synergizes well with recent training improvements aimed specifically at NVFP4. We further validate Quartet II on end-to-end LLM training with up to 1.9B parameters on 38B tokens. We provide kernels for execution on NVIDIA Blackwell GPUs with up to 4.2x speedup over BF16. Our code is available at https://github.com/IST-DASLab/Quartet-II.
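For readers unfamiliar with the SR baseline mentioned above, the following is a minimal sketch of stochastic rounding onto the FP4 (E2M1) value grid used by NVFP4 elements. It is not MS-EDEN and not the authors' implementation: the function names are hypothetical, the input is assumed to be already scaled into [-6, 6], and the per-block scale factors of the real format are omitted. It only illustrates why SR is unbiased while round-to-nearest is not.

```python
import bisect
import random

# Non-negative magnitudes representable in FP4 E2M1 (the NVFP4 element type).
FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def stochastic_round_fp4(x, rng):
    """Round x to the FP4 grid so that the expected output equals x.

    Assumption: x has already been scaled into [-6, 6]; values outside
    that range are clipped, which does introduce bias.
    """
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), FP4_GRID[-1])
    hi_idx = bisect.bisect_left(FP4_GRID, mag)
    if FP4_GRID[hi_idx] == mag:
        return sign * mag  # already representable, nothing to round
    lo, hi = FP4_GRID[hi_idx - 1], FP4_GRID[hi_idx]
    # Round up with probability proportional to the distance from lo.
    p_up = (mag - lo) / (hi - lo)
    return sign * (hi if rng.random() < p_up else lo)


def round_to_nearest_fp4(x):
    """Deterministic baseline: pick the closest grid point (biased per value)."""
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), FP4_GRID[-1])
    return sign * min(FP4_GRID, key=lambda g: abs(g - mag))


if __name__ == "__main__":
    rng = random.Random(0)
    x = 2.4
    samples = [stochastic_round_fp4(x, rng) for _ in range(20000)]
    print("round-to-nearest:", round_to_nearest_fp4(x))
    print("mean of SR samples:", sum(samples) / len(samples))
```

Each SR sample lands on one of the two neighbouring grid points, so an individual value carries more error than round-to-nearest, but the average over many samples converges to the input. That trade of variance for zero bias is the property the abstract refers to when it compares quantization error against SR.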
