
AlpaGasus: Training A Better Alpaca with Fewer Data

July 17, 2023
作者: Lichang Chen, Shiyang Li, Jun Yan, Hai Wang, Kalpa Gunaratna, Vikas Yadav, Zheng Tang, Vijay Srinivasan, Tianyi Zhou, Heng Huang, Hongxia Jin
cs.AI

Abstract

Large language models (LLMs) obtain instruction-following capability through instruction-finetuning (IFT) on supervised instruction/response data. However, widely used IFT datasets (e.g., Alpaca's 52k data) surprisingly contain many low-quality instances with incorrect or irrelevant responses, which are misleading and detrimental to IFT. In this paper, we propose a simple and effective data selection strategy that automatically identifies and removes low-quality data using a strong LLM (e.g., ChatGPT). To this end, we introduce AlpaGasus, which is finetuned on only 9k high-quality data filtered from the 52k Alpaca data. AlpaGasus significantly outperforms the original Alpaca as evaluated by GPT-4 on multiple test sets, and its 13B variant matches >90% performance of its teacher LLM (i.e., Text-Davinci-003) on test tasks. It also provides 5.7x faster training, reducing the training time for a 7B variant from 80 minutes (for Alpaca) to 14 minutes. (We apply IFT for the same number of epochs as Alpaca (7B) but on fewer data, using 4x NVIDIA A100 (80GB) GPUs and following the original Alpaca setting and hyperparameters.) Overall, AlpaGasus demonstrates a novel data-centric IFT paradigm that can be generally applied to instruction-tuning data, leading to faster training and better instruction-following models. Our project page is available at: https://lichang-chen.github.io/AlpaGasus/.
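
To make the selection strategy concrete, below is a minimal Python sketch of the kind of LLM-based filtering the abstract describes: a strong grader model (e.g., ChatGPT) scores each Alpaca-style instruction/response example, and only high-scoring examples are kept for finetuning. The prompt wording, 0-5 scale, 4.5 threshold, model name, and helper function names are illustrative assumptions, not the paper's exact configuration.

# Sketch of LLM-based data filtering: grade each Alpaca-style example with a
# strong LLM and keep only high-scoring ones for IFT. Prompt, scale, and
# threshold below are assumptions for illustration.
import json
import re

from openai import OpenAI  # official OpenAI Python client

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

GRADING_PROMPT = (
    "You are grading training data for instruction tuning.\n"
    "Instruction: {instruction}\n"
    "Input: {input}\n"
    "Response: {response}\n"
    "Rate the accuracy and helpfulness of the response on a scale of 0 to 5. "
    "Reply with the numeric score only."
)

def score_example(example: dict) -> float:
    """Ask the grader LLM for a quality score of one instruction/response pair."""
    prompt = GRADING_PROMPT.format(
        instruction=example["instruction"],
        input=example.get("input", ""),
        response=example["output"],
    )
    reply = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    text = reply.choices[0].message.content
    match = re.search(r"\d+(\.\d+)?", text)
    return float(match.group()) if match else 0.0

def filter_dataset(path: str, threshold: float = 4.5) -> list[dict]:
    """Keep only examples whose grader score meets the threshold."""
    with open(path) as f:
        data = json.load(f)  # Alpaca-style list of {"instruction", "input", "output"}
    return [ex for ex in data if score_example(ex) >= threshold]

if __name__ == "__main__":
    kept = filter_dataset("alpaca_data.json")
    print(f"Kept {len(kept)} high-quality examples for finetuning.")

The filtered subset can then be used in place of the full 52k set with the original Alpaca training recipe unchanged, which is where the reported 5.7x reduction in training time comes from.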