Full Parameter Fine-tuning for Large Language Models with Limited Resources

Abstract

Large Language Models (LLMs) have revolutionized Natural Language Processing (NLP) but demand massive GPU resources for training. Lowering the barrier to LLM training would encourage greater participation from researchers, benefiting both academia and society. While existing approaches have focused on parameter-efficient fine-tuning, which tunes or adds a small number of parameters, few have addressed the challenge of tuning the full parameters of LLMs with limited resources. In this work, we propose a new optimizer, LOw-Memory Optimization (LOMO), which fuses the gradient computation and the parameter update into one step to reduce memory usage. By integrating LOMO with existing memory-saving techniques, we reduce memory usage to 10.8% of that of the standard approach (DeepSpeed solution). Consequently, our approach enables full-parameter fine-tuning of a 65B model on a single machine with eight RTX 3090 GPUs, each with 24 GB of memory.
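The idea of fusing the gradient computation and the parameter update can be illustrated with a toy sketch. The code below is not the authors' implementation: it assumes a plain SGD-style update on a chain of scalar linear layers, and the function names (forward, standard_sgd_step, fused_sgd_step) are hypothetical. It only shows that updating each weight as soon as its gradient is available gives the same result as the usual two-phase step while holding one gradient at a time instead of one per parameter.

```python
def forward(weights, x):
    """Run a chain of scalar linear layers, keeping every layer input."""
    activations = [x]
    for w in weights:
        activations.append(w * activations[-1])
    return activations


def standard_sgd_step(weights, x, target, lr):
    """Two-phase step: compute and store all gradients, then update."""
    acts = forward(weights, x)
    delta = acts[-1] - target            # dLoss/dOutput for 0.5 * (out - target)**2
    grads = [0.0] * len(weights)
    for i in reversed(range(len(weights))):
        grads[i] = delta * acts[i]       # dLoss/dw_i
        delta = delta * weights[i]       # propagate to the previous layer
    new_weights = [w - lr * g for w, g in zip(weights, grads)]
    return new_weights, len(grads)       # every gradient is alive at once


def fused_sgd_step(weights, x, target, lr):
    """Fused step (assumed SGD rule): update each weight during the backward pass."""
    acts = forward(weights, x)
    delta = acts[-1] - target
    new_weights = list(weights)
    for i in reversed(range(len(weights))):
        grad = delta * acts[i]           # the only gradient held right now
        delta = delta * weights[i]       # must use the pre-update weight
        new_weights[i] = weights[i] - lr * grad  # update, then let grad go
    return new_weights, 1                # one gradient alive at any time


if __name__ == "__main__":
    w0 = [0.5, -1.5, 2.0]
    std_w, std_peak = standard_sgd_step(w0, x=1.0, target=1.0, lr=0.1)
    fused_w, fused_peak = fused_sgd_step(w0, x=1.0, target=1.0, lr=0.1)
    print("same weights:", std_w == fused_w)
    print("gradients held:", std_peak, "vs", fused_peak)
```

In this sketch the saving is one stored gradient per layer, and the upstream signal is propagated with the pre-update weight so that the fused step matches the standard one. How a real framework exposes such a fused update, and how it combines with other memory-saving techniques, is beyond what the abstract states.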
