

Full Parameter Fine-tuning for Large Language Models with Limited Resources

June 16, 2023
Authors: Kai Lv, Yuqing Yang, Tengxiao Liu, Qinghui Gao, Qipeng Guo, Xipeng Qiu
cs.AI

Abstract

Large Language Models (LLMs) have revolutionized Natural Language Processing (NLP) but demand massive GPU resources for training. Lowering the threshold for LLM training would encourage greater participation from researchers, benefiting both academia and society. While existing approaches have focused on parameter-efficient fine-tuning, which tunes or adds a small number of parameters, few have addressed the challenge of tuning the full parameters of LLMs with limited resources. In this work, we propose a new optimizer, LOw-Memory Optimization (LOMO), which fuses the gradient computation and the parameter update in one step to reduce memory usage. By integrating LOMO with existing memory-saving techniques, we reduce memory usage to 10.8% of that of the standard approach (the DeepSpeed solution). Consequently, our approach enables full parameter fine-tuning of a 65B model on a single machine with 8 RTX 3090 GPUs, each with 24GB memory.
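
The sketch below is a minimal illustration of the fused gradient-computation/parameter-update idea the abstract describes, not the authors' released LOMO implementation. It assumes PyTorch 2.1+ (for `Tensor.register_post_accumulate_grad_hook`), and the function name `attach_fused_sgd` and the learning-rate choice are hypothetical.

```python
# Minimal sketch of a fused backward-pass update (assumption: PyTorch >= 2.1).
# As soon as a parameter's gradient is accumulated, apply a plain SGD step
# in place and free the gradient, so gradients for all parameters never
# need to be held in memory at the same time.
import torch
import torch.nn as nn


def attach_fused_sgd(model: nn.Module, lr: float = 1e-3) -> None:
    """Register per-parameter hooks that update and free each gradient immediately."""
    def hook(param: torch.Tensor) -> None:
        with torch.no_grad():
            param.add_(param.grad, alpha=-lr)  # in-place SGD update
        param.grad = None                      # release the gradient right away

    for p in model.parameters():
        if p.requires_grad:
            p.register_post_accumulate_grad_hook(hook)


# Usage sketch: the hooks fire during loss.backward(), so no separate
# optimizer.step() is needed and no .grad tensors remain afterwards.
model = nn.Linear(1024, 1024)
attach_fused_sgd(model, lr=1e-3)
x = torch.randn(8, 1024)
loss = model(x).pow(2).mean()
loss.backward()  # parameters are updated as their gradients are produced
```

Because each gradient is consumed and released as soon as it is produced, peak gradient memory scales with the largest single parameter tensor rather than with the sum of all gradients, which is the memory-reduction effect the abstract attributes to LOMO.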