Full Parameter Fine-tuning for Large Language Models with Limited Resources
June 16, 2023
Authors: Kai Lv, Yuqing Yang, Tengxiao Liu, Qinghui Gao, Qipeng Guo, Xipeng Qiu
cs.AI
Abstract
Large Language Models (LLMs) have revolutionized Natural Language Processing (NLP) but demand massive GPU resources for training. Lowering the barrier to LLM training would encourage greater participation from researchers, benefiting both academia and society. While existing approaches have focused on parameter-efficient fine-tuning, which tunes or adds a small number of parameters, few have addressed the challenge of tuning the full parameters of LLMs with limited resources. In this work, we propose a new optimizer, LOw-Memory Optimization (LOMO), which fuses the gradient computation and the parameter update in one step to reduce memory usage. By integrating LOMO with existing memory-saving techniques, we reduce memory usage to 10.8% of that of the standard approach (the DeepSpeed solution). Consequently, our approach enables full-parameter fine-tuning of a 65B model on a single machine with 8 RTX 3090 GPUs, each with 24GB of memory.
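
The core idea described here, fusing the parameter update into the gradient computation, can be illustrated with a short sketch. The snippet below is a minimal illustration only, assuming a recent PyTorch (>= 2.1, for register_post_accumulate_grad_hook), and the helper name attach_fused_sgd_hooks is hypothetical; it is not the authors' released LOMO implementation. Each parameter gets a hook that applies an SGD-style update as soon as that parameter's gradient is produced during backward(), then frees the gradient, so gradients for all parameters never have to be held in memory at the same time.

    import torch

    # Minimal sketch of fusing the parameter update into the backward pass.
    # Assumes PyTorch >= 2.1 (register_post_accumulate_grad_hook); the function
    # name attach_fused_sgd_hooks is illustrative, not from the paper's code.
    def attach_fused_sgd_hooks(model: torch.nn.Module, lr: float = 1e-3) -> None:
        def hook(param: torch.Tensor) -> None:
            # Update the parameter in place the moment its gradient is ready ...
            with torch.no_grad():
                param.add_(param.grad, alpha=-lr)
            # ... then free the gradient so full gradients for every parameter
            # never coexist in memory, which is where the savings come from.
            param.grad = None

        for p in model.parameters():
            if p.requires_grad:
                p.register_post_accumulate_grad_hook(hook)

    # Usage: a single backward() call now computes gradients layer by layer and
    # applies the updates on the fly; no separate optimizer.step() is needed.
    #   attach_fused_sgd_hooks(model)
    #   loss = model(**batch).loss
    #   loss.backward()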