
Ensemble-Instruct: Generating Instruction-Tuning Data with a Heterogeneous Mixture of LMs

October 21, 2023
Authors: Young-Suk Lee, Md Arafat Sultan, Yousef El-Kurdi, Tahira Naseem, Asim Munawar, Radu Florian, Salim Roukos, Ramón Fernandez Astudillo
cs.AI

Abstract

Using in-context learning (ICL) for data generation, techniques such as Self-Instruct (Wang et al., 2023) or the follow-up Alpaca (Taori et al., 2023) can train strong conversational agents with only a small amount of human supervision. One limitation of these approaches is that they resort to very large language models (around 175B parameters) that are also proprietary and non-public. Here we explore the application of such techniques to language models that are much smaller (around 10B--40B parameters) and have permissive licenses. We find the Self-Instruct approach to be less effective at these sizes and propose new ICL methods that draw on two main ideas: (a) Categorization and simplification of the ICL templates to make prompt learning easier for the LM, and (b) Ensembling over multiple LM outputs to help select high-quality synthetic examples. Our algorithm leverages the 175 Self-Instruct seed tasks and employs separate pipelines for instructions that require an input and instructions that do not. Empirical investigations with different LMs show that: (1) Our proposed method yields higher-quality instruction tuning data than Self-Instruct, (2) It improves performances of both vanilla and instruction-tuned LMs by significant margins, and (3) Smaller instruction-tuned LMs generate more useful outputs than their larger un-tuned counterparts. Our codebase is available at https://github.com/IBM/ensemble-instruct.
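To make the ensembling idea concrete, below is a minimal sketch of consensus-based selection over multiple LM outputs for the same synthetic instruction. The agreement metric (an LCS-based ROUGE-L F1), the 0.5 threshold, and all function names are illustrative assumptions, not the paper's exact configuration; see the linked codebase for the actual implementation.

```python
# Hypothetical sketch: keep the candidate output that agrees most with the
# outputs produced by the other LMs, and drop the synthetic example entirely
# if even the best candidate falls below an agreement threshold.
# Metric, threshold, and names are assumptions for illustration only.

def lcs_len(a: list[str], b: list[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l_f1(hyp: str, ref: str) -> float:
    """Simple ROUGE-L F1 between two whitespace-tokenized strings."""
    h, r = hyp.split(), ref.split()
    if not h or not r:
        return 0.0
    lcs = lcs_len(h, r)
    prec, rec = lcs / len(h), lcs / len(r)
    return 0.0 if prec + rec == 0 else 2 * prec * rec / (prec + rec)

def select_by_consensus(candidates: list[str], threshold: float = 0.5) -> str | None:
    """Return the candidate with the highest mean agreement with the others,
    or None if no candidate reaches the threshold (example is discarded)."""
    best_output, best_score = None, -1.0
    for i, cand in enumerate(candidates):
        others = [c for j, c in enumerate(candidates) if j != i]
        score = sum(rouge_l_f1(cand, o) for o in others) / max(len(others), 1)
        if score > best_score:
            best_output, best_score = cand, score
    return best_output if best_score >= threshold else None

# Example: three LMs answer the same generated instruction; the two agreeing
# answers dominate, so a consensus answer is kept rather than the outlier.
outputs = [
    "The capital of France is Paris.",
    "Paris is the capital of France.",
    "I cannot answer that question.",
]
print(select_by_consensus(outputs))
```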