

EE-Tuning: An Economical yet Scalable Solution for Tuning Early-Exit Large Language Models

February 1, 2024
Authors: Xuchen Pan, Yanxi Chen, Yaliang Li, Bolin Ding, Jingren Zhou
cs.AI

Abstract

This work introduces EE-Tuning, a lightweight and economical solution to training/tuning early-exit large language models (LLMs). In contrast to the common approach of full-parameter pre-training, EE-Tuning augments any pre-trained (and possibly fine-tuned) standard LLM with additional early-exit layers that are tuned in a parameter-efficient manner, which requires significantly less computational resources and training data. Our implementation of EE-Tuning achieves outstanding training efficiency via extensive performance optimizations, as well as scalability due to its full compatibility with 3D parallelism. Results of systematic experiments validate the efficacy of EE-Tuning, confirming that effective early-exit LLM inference can be achieved with a limited training budget. In hope of making early-exit LLMs accessible to the community, we release the source code of our implementation of EE-Tuning at https://github.com/pan-x-c/EE-LLM.
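The core idea described in the abstract, attaching trainable early-exit heads to a frozen pre-trained backbone, can be illustrated with a short PyTorch-style sketch. This is a minimal illustration under assumed interfaces, not the released EE-Tuning implementation; the module and function names (EarlyExitHead, attach_early_exits, tuning_loss) and the choice of a LayerNorm-plus-linear exit head are hypothetical simplifications for clarity.

```python
# Minimal sketch of parameter-efficient early-exit tuning (illustrative only):
# freeze a pre-trained decoder backbone and train only lightweight exit heads
# attached to selected intermediate layers. All names/shapes are assumptions.
import torch
import torch.nn as nn

class EarlyExitHead(nn.Module):
    """Small tunable head mapping an intermediate hidden state to vocabulary logits."""
    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.norm = nn.LayerNorm(hidden_size)
        self.lm_head = nn.Linear(hidden_size, vocab_size, bias=False)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return self.lm_head(self.norm(hidden_states))

def attach_early_exits(backbone: nn.Module, exit_layers, hidden_size, vocab_size):
    """Freeze all backbone parameters and create one trainable head per chosen layer."""
    for p in backbone.parameters():
        p.requires_grad = False  # parameter-efficient: the backbone stays fixed
    return nn.ModuleDict({str(i): EarlyExitHead(hidden_size, vocab_size)
                          for i in exit_layers})

def tuning_loss(layers: nn.ModuleList, embed: nn.Module, exits: nn.ModuleDict,
                input_ids: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Run the frozen backbone once, accumulating a LM loss at every early exit."""
    loss_fn = nn.CrossEntropyLoss()
    h = embed(input_ids)
    total = torch.zeros((), device=input_ids.device)
    for i, block in enumerate(layers):
        h = block(h)                      # frozen transformer block
        if str(i) in exits:
            logits = exits[str(i)](h)     # only the exit head receives gradients
            total = total + loss_fn(logits.flatten(0, 1), labels.flatten())
    return total
```

At inference time, a token can then be emitted from whichever exit first satisfies a confidence criterion, skipping the remaining layers; the paper's actual training recipe, exit architectures, and 3D-parallel implementation are detailed in the released code at https://github.com/pan-x-c/EE-LLM.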