Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs
September 11, 2023
Authors: Wenhua Cheng, Weiwei Zhang, Haihao Shen, Yiyang Cai, Xin He, Kaokao Lv
cs.AI
Abstract
Large Language Models (LLMs) have proven their exceptional capabilities in
performing language-related tasks. However, their deployment poses significant
challenges due to their considerable memory and storage requirements. In
response to this issue, weight-only quantization, particularly 3 and 4-bit
weight-only quantization, has emerged as one of the most viable solutions. As
the number of bits decreases, the quantization grid broadens, thus emphasizing
the importance of up and down rounding. While previous studies have
demonstrated that fine-tuning up and down rounding with the addition of
perturbations can enhance accuracy in some scenarios, our study is driven by
the precise and limited boundary of these perturbations, where only the
threshold for altering the rounding value is of significance. Consequently, we
propose a concise and highly effective approach for optimizing the weight
rounding task. Our method, named SignRound, involves lightweight block-wise
tuning using signed gradient descent, enabling us to achieve outstanding
results within 400 steps. SignRound outperforms the established baseline of
rounding-to-nearest (RTN) and competes impressively against recent methods,
without introducing additional inference overhead. The source code will be
publicly available at https://github.com/intel/neural-compressor soon.
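To make the idea concrete, below is a minimal, illustrative sketch of optimizing weight rounding with signed gradient descent. It is not the authors' implementation: the function names (`sign_round`, `quantize`, `ste_round`), the learning rate, and the single-layer MSE objective are assumptions for illustration, whereas SignRound tunes perturbations block-wise over transformer blocks with calibration data. The sketch learns a perturbation `V` bounded to [-0.5, 0.5] that only shifts the threshold between rounding up and rounding down, and updates it with the sign of the gradient for a fixed number of steps.

```python
# Illustrative sketch only (assumed names and hyperparameters), not the SignRound source code.
import torch

def ste_round(x):
    # Round in the forward pass, identity gradient in the backward pass
    # (straight-through estimator), so the perturbation V stays trainable.
    return (x.round() - x).detach() + x

def quantize(W, s, V, n_bits=4):
    # Symmetric weight-only quantization with a learnable rounding perturbation V.
    qmax = 2 ** (n_bits - 1) - 1
    q = torch.clamp(ste_round(W / s + V), -qmax - 1, qmax)
    return q * s

def sign_round(W, s, X, n_bits=4, steps=400, lr=2.5e-3):
    """Tune V with signed gradient descent so the quantized weights
    reproduce the full-precision output X @ W.T on calibration inputs X."""
    V = torch.zeros_like(W, requires_grad=True)
    ref = (X @ W.t()).detach()                   # full-precision reference output
    for _ in range(steps):
        out = X @ quantize(W, s, V, n_bits).t()  # output with current rounding choices
        loss = torch.nn.functional.mse_loss(out, ref)
        loss.backward()
        with torch.no_grad():
            V -= lr * V.grad.sign()              # signed gradient step
            V.clamp_(-0.5, 0.5)                  # only the up/down rounding threshold can move
            V.grad.zero_()
    return quantize(W, s, V.detach(), n_bits)
```

Because `V` never leaves [-0.5, 0.5], the procedure can only flip individual weights between the two nearest grid points, which matches the abstract's observation that the rounding threshold, rather than an arbitrary perturbation, is what matters; the quantized weights it returns use the standard integer grid, so inference incurs no extra overhead.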