Ensemble-Instruct: Generating Instruction-Tuning Data with a Heterogeneous Mixture of LMs
October 21, 2023
Authors: Young-Suk Lee, Md Arafat Sultan, Yousef El-Kurdi, Tahira Naseem, Asim Munawar, Radu Florian, Salim Roukos, Ramón Fernandez Astudillo
cs.AI
Abstract
Using in-context learning (ICL) for data generation, techniques such as Self-Instruct (Wang et al., 2023) or the follow-up Alpaca (Taori et al., 2023) can train strong conversational agents with only a small amount of human supervision. One limitation of these approaches is that they resort to very large language models (around 175B parameters) that are also proprietary and non-public. Here we explore the application of such techniques to language models that are much smaller (around 10B–40B parameters) and have permissive licenses. We find the Self-Instruct approach to be less effective at these sizes and propose new ICL methods that draw on two main ideas: (a) Categorization and simplification of the ICL templates to make prompt learning easier for the LM, and (b) Ensembling over multiple LM outputs to help select high-quality synthetic examples. Our algorithm leverages the 175 Self-Instruct seed tasks and employs separate pipelines for instructions that require an input and instructions that do not. Empirical investigations with different LMs show that: (1) Our proposed method yields higher-quality instruction tuning data than Self-Instruct, (2) It improves performances of both vanilla and instruction-tuned LMs by significant margins, and (3) Smaller instruction-tuned LMs generate more useful outputs than their larger un-tuned counterparts. Our codebase is available at https://github.com/IBM/ensemble-instruct.
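To make the second idea concrete, the sketch below illustrates one way to ensemble the outputs of several smaller LMs in order to keep only high-agreement synthetic examples. It is a minimal illustration rather than the authors' released implementation: the `LMGenerator` callables, the use of Rouge-L (via the `rouge_score` package) as the agreement metric, and the `threshold` value are all assumptions made for the example; the actual filtering criteria are defined in the paper and the linked codebase.

```python
from typing import Callable, List, Optional
from rouge_score import rouge_scorer

# Hypothetical LM wrappers: each callable maps a prompt to a generated string.
# In practice these could wrap smaller, permissively licensed models
# (roughly 10B-40B parameters) served locally or behind an API.
LMGenerator = Callable[[str], str]

_scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)


def consensus_score(candidate: str, others: List[str]) -> float:
    """Average Rouge-L F1 of `candidate` against every other LM's output."""
    if not others:
        return 0.0
    scores = [_scorer.score(other, candidate)["rougeL"].fmeasure for other in others]
    return sum(scores) / len(scores)


def ensemble_select_output(
    prompt: str,
    generators: List[LMGenerator],
    threshold: float = 0.5,  # illustrative cutoff, not the paper's value
) -> Optional[str]:
    """Generate one output per LM and keep the candidate that agrees most
    with the rest of the ensemble, provided it clears the threshold.

    Returns None when no candidate reaches sufficient agreement, i.e. the
    synthetic example is discarded as likely low quality.
    """
    outputs = [generate(prompt) for generate in generators]
    best, best_score = None, -1.0
    for i, candidate in enumerate(outputs):
        others = outputs[:i] + outputs[i + 1:]
        score = consensus_score(candidate, others)
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= threshold else None
```

In use, a generation pipeline would call `ensemble_select_output` once per synthesized instruction (with or without an input field) and add the example to the instruction-tuning set only when a non-None output is returned, so that disagreement among the LMs acts as an automatic quality filter.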