Microscaling Data Formats for Deep Learning
October 16, 2023
Authors: Bita Darvish Rouhani, Ritchie Zhao, Ankit More, Mathew Hall, Alireza Khodamoradi, Summer Deng, Dhruv Choudhary, Marius Cornea, Eric Dellinger, Kristof Denolf, Stosic Dusan, Venmugil Elango, Maximilian Golub, Alexander Heinecke, Phil James-Roxby, Dharmesh Jani, Gaurav Kolhe, Martin Langhammer, Ada Li, Levi Melnick, Maral Mesmakhosroshahi, Andres Rodriguez, Michael Schulte, Rasoul Shafipour, Lei Shao, Michael Siu, Pradeep Dubey, Paulius Micikevicius, Maxim Naumov, Colin Verilli, Ralph Wittig, Eric Chung
cs.AI
Abstract
Narrow bit-width data formats are key to reducing the computational and
storage costs of modern deep learning applications. This paper evaluates
Microscaling (MX) data formats that combine a per-block scaling factor with
narrow floating-point and integer types for individual elements. MX formats
balance the competing needs of hardware efficiency, model accuracy, and user
friction. Empirical results on over two dozen benchmarks demonstrate the
practicality of MX data formats as a drop-in replacement for baseline FP32 for
AI inference and training with low user friction. We also show the first
AI inference and training with low user friction. We also show the first
instance of training generative language models at sub-8-bit weights,
activations, and gradients with minimal accuracy loss and no modifications to
the training recipe.
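The core idea of an MX format is a small block of elements (e.g. 32 values) that shares a single power-of-two scaling factor, with each element stored in a narrow type. The sketch below illustrates this with signed-integer elements. It is a simplified, hypothetical MXINT-style quantizer for illustration only; the actual OCP MX specification defines the shared scale as an E8M0 exponent byte and prescribes specific element encodings (e.g. FP8, FP6, FP4, INT8) that this sketch does not reproduce.

```python
import numpy as np

def mx_int_quantize(block, elem_bits=8):
    """Quantize a block of floats with one shared power-of-two scale and
    narrow signed-integer elements.

    Simplified MXINT-style sketch, not the exact OCP MX encoding: the
    shared scale is chosen so the largest magnitude in the block fits in
    an `elem_bits`-bit signed integer.
    """
    block = np.asarray(block, dtype=np.float32)
    amax = float(np.max(np.abs(block)))
    if amax == 0.0:
        return np.zeros(block.shape, dtype=np.int32), 1.0
    # Power-of-two scale: amax / scale lands in [2**(elem_bits-2), 2**(elem_bits-1)).
    exp = int(np.floor(np.log2(amax))) - (elem_bits - 2)
    scale = 2.0 ** exp
    qmax = 2 ** (elem_bits - 1) - 1
    # Round to nearest integer element and clip to the representable range.
    q = np.clip(np.rint(block / scale), -qmax - 1, qmax).astype(np.int32)
    return q, scale

def mx_dequantize(q, scale):
    """Reconstruct approximate float values from quantized elements."""
    return q.astype(np.float32) * scale

# A 32-element block, as in the MX formats evaluated in the paper.
x = np.linspace(-1.0, 1.0, 32, dtype=np.float32)
q, scale = mx_int_quantize(x, elem_bits=8)
x_hat = mx_dequantize(q, scale)
# Rounding error per element is bounded by half the shared scale step.
assert np.max(np.abs(x_hat - x)) <= scale / 2 + 1e-7
```

Storing one shared scale per block amortizes its cost over the whole block while the narrow per-element type keeps arithmetic and memory cheap, which is the hardware-efficiency/accuracy trade-off the abstract describes.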