EE-Tuning: An Economical yet Scalable Solution for Tuning Early-Exit Large Language Models
February 1, 2024
Authors: Xuchen Pan, Yanxi Chen, Yaliang Li, Bolin Ding, Jingren Zhou
cs.AI
Abstract
This work introduces EE-Tuning, a lightweight and economical solution to
training/tuning early-exit large language models (LLMs). In contrast to the
common approach of full-parameter pre-training, EE-Tuning augments any
pre-trained (and possibly fine-tuned) standard LLM with additional early-exit
layers that are tuned in a parameter-efficient manner, which requires
significantly fewer computational resources and less training data. Our
implementation of EE-Tuning achieves outstanding training efficiency via
extensive performance optimizations, as well as scalability due to its full
compatibility with 3D parallelism. Results of systematic experiments validate
the efficacy of EE-Tuning, confirming that effective early-exit LLM inference
can be achieved with a limited training budget. In the hope of making
early-exit LLMs accessible to the community, we release the source code of
our implementation of EE-Tuning at https://github.com/pan-x-c/EE-LLM.
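
To make the recipe in the abstract concrete, below is a minimal PyTorch sketch of the core idea: keep the pre-trained standard LLM frozen and train only newly added early-exit output layers, so that tuning touches a small fraction of the parameters. This is an illustrative sketch under stated assumptions, not the authors' implementation (which is built for 3D-parallel training); the names `EarlyExitHead`, `attach_and_freeze`, and `early_exit_loss`, and the choice of exit positions, are hypothetical.

```python
import torch
import torch.nn as nn


class EarlyExitHead(nn.Module):
    """Hypothetical early-exit layer: maps an intermediate hidden state to vocabulary logits."""

    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.norm = nn.LayerNorm(hidden_size)
        self.lm_head = nn.Linear(hidden_size, vocab_size, bias=False)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return self.lm_head(self.norm(hidden_states))


def attach_and_freeze(backbone: nn.Module, exit_layer_indices, hidden_size, vocab_size):
    """Freeze the pre-trained backbone and create trainable exit heads at the given layers.

    Only the returned heads receive gradients, which is what makes the tuning
    parameter-efficient in this sketch.
    """
    for p in backbone.parameters():
        p.requires_grad = False  # the standard LLM stays fixed
    return nn.ModuleDict(
        {str(i): EarlyExitHead(hidden_size, vocab_size) for i in exit_layer_indices}
    )


def early_exit_loss(hidden_per_layer, heads, labels, loss_fn=nn.CrossEntropyLoss()):
    """Sum language-modeling losses over all early-exit heads.

    `hidden_per_layer[i]` is the hidden state after backbone layer i,
    shaped (batch, seq, hidden); `labels` is shaped (batch, seq).
    """
    total = 0.0
    for idx, head in heads.items():
        logits = head(hidden_per_layer[int(idx)])  # (batch, seq, vocab)
        total = total + loss_fn(logits.flatten(0, 1), labels.flatten())
    return total
```

At inference time, each tuned head can produce a next-token prediction from an intermediate layer, and generation typically stops at the first head whose confidence passes a threshold instead of running the full depth; the released EE-Tuning/EE-LLM codebase handles this together with the 3D-parallel training mentioned above, which this single-device sketch does not attempt to reproduce.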