Instruction Pre-Training: Language Models are Supervised Multitask Learners
June 20, 2024
Authors: Daixuan Cheng, Yuxian Gu, Shaohan Huang, Junyu Bi, Minlie Huang, Furu Wei
cs.AI
Abstract
Unsupervised multitask pre-training has been the critical method behind the
recent success of language models (LMs). However, supervised multitask learning
still holds significant promise, as scaling it in the post-training stage
trends towards better generalization. In this paper, we explore supervised
multitask pre-training by proposing Instruction Pre-Training, a framework that
scalably augments massive raw corpora with instruction-response pairs to
pre-train LMs. The instruction-response pairs are generated by an efficient
instruction synthesizer built on open-source models. In our experiments, we
synthesize 200M instruction-response pairs covering 40+ task categories to
verify the effectiveness of Instruction Pre-Training. In pre-training from
scratch, Instruction Pre-Training not only consistently enhances pre-trained
base models but also benefits more from further instruction tuning. In
continual pre-training, Instruction Pre-Training enables Llama3-8B to be
comparable to or even outperform Llama3-70B. Our model, code, and data are
available at https://github.com/microsoft/LMOps.
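The core idea of the framework can be illustrated with a minimal sketch: each raw document is passed through an instruction synthesizer, and the resulting instruction-response pairs are appended to the document to form an augmented pre-training example. The synthesizer below (`synthesize_pairs`) is a hypothetical stub standing in for the paper's fine-tuned open-source model; the function names and concatenation format are assumptions for illustration, not the paper's actual implementation.

```python
# Hedged sketch of instruction-augmented pre-training data, assuming a
# stub synthesizer. In the paper, the synthesizer is an LM fine-tuned
# to read a raw text and emit instruction-response pairs grounded in it.
from typing import Callable


def synthesize_pairs(raw_text: str) -> list[tuple[str, str]]:
    # Stub: a real synthesizer would generate diverse task pairs
    # (40+ task categories in the paper) conditioned on raw_text.
    return [("Summarize the passage.", raw_text[:50])]


def augment_corpus(
    raw_texts: list[str],
    synthesizer: Callable[[str], list[tuple[str, str]]],
) -> list[str]:
    """Turn raw texts into instruction-augmented pre-training examples."""
    examples = []
    for text in raw_texts:
        pairs = synthesizer(text)
        # Append synthesized pairs after the raw text; the exact
        # template is an assumption for this sketch.
        qa = "\n".join(f"Instruction: {i}\nResponse: {r}" for i, r in pairs)
        examples.append(f"{text}\n\n{qa}")
    return examples


corpus = ["Language models learn from large text corpora."]
augmented = augment_corpus(corpus, synthesize_pairs)
print(augmented[0])
```

The augmented examples are then used as ordinary pre-training text, so the same next-token objective implicitly supervises the model on the synthesized tasks.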