EE-Tuning: An Economical yet Scalable Solution for Tuning Early-Exit Large Language Models
February 1, 2024
Authors: Xuchen Pan, Yanxi Chen, Yaliang Li, Bolin Ding, Jingren Zhou
cs.AI
Abstract
This work introduces EE-Tuning, a lightweight and economical solution to
training/tuning early-exit large language models (LLMs). In contrast to the
common approach of full-parameter pre-training, EE-Tuning augments any
pre-trained (and possibly fine-tuned) standard LLM with additional early-exit
layers that are tuned in a parameter-efficient manner, which requires
significantly fewer computational resources and less training data. Our
implementation of EE-Tuning achieves outstanding training efficiency via
extensive performance optimizations, as well as scalability due to its full
compatibility with 3D parallelism. Results of systematic experiments validate
the efficacy of EE-Tuning, confirming that effective early-exit LLM inference
can be achieved with a limited training budget. In the hope of making
early-exit LLMs accessible to the community, we release the source code of
our implementation of EE-Tuning at https://github.com/pan-x-c/EE-LLM.
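
To make the recipe in the abstract concrete, below is a minimal PyTorch sketch of the core idea: keep the pre-trained standard LLM frozen and train only newly added early-exit output layers, so that tuning touches a small fraction of the parameters. This is an illustrative sketch under stated assumptions, not the authors' implementation (which is built for 3D-parallel training); the names `EarlyExitHead`, `attach_and_freeze`, and `early_exit_loss`, and the choice of exit positions, are hypothetical.

```python
import torch
import torch.nn as nn


class EarlyExitHead(nn.Module):
    """Hypothetical early-exit layer: maps an intermediate hidden state to vocabulary logits."""

    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.norm = nn.LayerNorm(hidden_size)
        self.lm_head = nn.Linear(hidden_size, vocab_size, bias=False)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return self.lm_head(self.norm(hidden_states))


def attach_and_freeze(backbone: nn.Module, exit_layer_indices, hidden_size, vocab_size):
    """Freeze the pre-trained backbone and create trainable exit heads at the given layers.

    Only the returned heads receive gradients, which is what makes the tuning
    parameter-efficient in this sketch.
    """
    for p in backbone.parameters():
        p.requires_grad = False  # the standard LLM stays fixed
    return nn.ModuleDict(
        {str(i): EarlyExitHead(hidden_size, vocab_size) for i in exit_layer_indices}
    )


def early_exit_loss(hidden_per_layer, heads, labels, loss_fn=nn.CrossEntropyLoss()):
    """Sum language-modeling losses over all early-exit heads.

    `hidden_per_layer[i]` is the hidden state after backbone layer i,
    shaped (batch, seq, hidden); `labels` is shaped (batch, seq).
    """
    total = 0.0
    for idx, head in heads.items():
        logits = head(hidden_per_layer[int(idx)])  # (batch, seq, vocab)
        total = total + loss_fn(logits.flatten(0, 1), labels.flatten())
    return total
```

At inference time, each tuned head can produce a next-token prediction from an intermediate layer, and generation typically stops at the first head whose confidence passes a threshold instead of running the full depth; the released EE-Tuning/EE-LLM codebase handles this together with the 3D-parallel training mentioned above, which this single-device sketch does not attempt to reproduce.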