EE-Tuning: An Economical yet Scalable Solution for Tuning Early-Exit Large Language Models

Abstract

This work introduces EE-Tuning, a lightweight and economical solution for training and tuning early-exit large language models (LLMs). In contrast to the common approach of full-parameter pre-training, EE-Tuning augments any pre-trained (and possibly fine-tuned) standard LLM with additional early-exit layers that are tuned in a parameter-efficient manner, which requires significantly fewer computational resources and much less training data. Our implementation of EE-Tuning achieves outstanding training efficiency via extensive performance optimizations, and it scales thanks to its full compatibility with 3D parallelism. Results of systematic experiments validate the efficacy of EE-Tuning, confirming that effective early-exit LLM inference can be achieved with a limited training budget. In the hope of making early-exit LLMs accessible to the community, we release the source code of our implementation of EE-Tuning at https://github.com/pan-x-c/EE-LLM.
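
The idea of attaching early-exit layers to an existing model and tuning only those layers can be illustrated with a toy sketch. This is not the EE-LLM implementation: the class `ToyEarlyExitModel`, its methods, the small tanh "backbone", and the confidence-threshold exit rule are all hypothetical stand-ins chosen for illustration, and treating the backbone as fully frozen is our reading of "parameter-efficient" here. The sketch uses only the Python standard library.

```python
import math
import random


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(weights, x):
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def random_matrix(rng, rows, cols):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


class ToyEarlyExitModel:
    """Hypothetical toy: a fixed backbone plus one trainable exit head."""

    def __init__(self, dim=4, n_layers=4, n_classes=3, exit_layer=2, seed=0):
        rng = random.Random(seed)
        # Stand-ins for pre-trained weights; nothing below ever updates them.
        self.backbone = [random_matrix(rng, dim, dim) for _ in range(n_layers)]
        self.final_head = random_matrix(rng, n_classes, dim)
        self.exit_layer = exit_layer
        # The only trainable parameters: a linear exit head, zero-initialised.
        self.exit_head = [[0.0] * dim for _ in range(n_classes)]

    def run_layers(self, h, start, stop):
        for weights in self.backbone[start:stop]:
            h = [math.tanh(v) for v in matvec(weights, h)]
        return h

    def exit_loss(self, data):
        """Mean cross-entropy of the exit head on (input, label) pairs."""
        total = 0.0
        for x, y in data:
            h = self.run_layers(x, 0, self.exit_layer)
            total -= math.log(softmax(matvec(self.exit_head, h))[y])
        return total / len(data)

    def tune_exit_head(self, data, lr=0.2, steps=50):
        """Full-batch gradient descent on the exit head only."""
        # The backbone is fixed, so hidden states can be computed once.
        hidden = [(self.run_layers(x, 0, self.exit_layer), y) for x, y in data]
        for _ in range(steps):
            grad = [[0.0] * len(row) for row in self.exit_head]
            for h, y in hidden:
                probs = softmax(matvec(self.exit_head, h))
                for k, p in enumerate(probs):
                    err = p - (1.0 if k == y else 0.0)
                    for j, hj in enumerate(h):
                        grad[k][j] += err * hj / len(hidden)
            for k, row in enumerate(grad):
                for j, g in enumerate(row):
                    self.exit_head[k][j] -= lr * g

    def predict(self, x, threshold):
        """Return (predicted class, number of backbone layers executed)."""
        h = self.run_layers(x, 0, self.exit_layer)
        probs = softmax(matvec(self.exit_head, h))
        if max(probs) >= threshold:  # assumed exit rule: confident enough
            return probs.index(max(probs)), self.exit_layer
        h = self.run_layers(h, self.exit_layer, len(self.backbone))
        probs = softmax(matvec(self.final_head, h))
        return probs.index(max(probs)), len(self.backbone)
```

In this toy, tuning touches only `exit_head`, so the cost of a training step does not include any update to the backbone. At inference, a lower `threshold` makes the model exit earlier more often, and a threshold above 1 disables early exit entirely.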
