AlpaGasus: Training A Better Alpaca with Fewer Data

Abstract

Large language models (LLMs) obtain instruction-following capability through instruction-finetuning (IFT) on supervised instruction/response data. However, widely used IFT datasets (e.g., Alpaca's 52k data) contain a surprising number of low-quality instances with incorrect or irrelevant responses, which are misleading and detrimental to IFT. In this paper, we propose a simple and effective data selection strategy that automatically identifies and removes low-quality data using a strong LLM (e.g., ChatGPT). To this end, we introduce AlpaGasus, which is finetuned on only 9k high-quality data filtered from the 52k Alpaca data. AlpaGasus significantly outperforms the original Alpaca as evaluated by GPT-4 on multiple test sets, and its 13B variant achieves more than 90% of the performance of its teacher LLM (i.e., Text-Davinci-003) on test tasks. It also provides 5.7x faster training, reducing the training time for a 7B variant from 80 minutes (for Alpaca) to 14 minutes. (We apply IFT for the same number of epochs as Alpaca (7B) but on fewer data, using 4× NVIDIA A100 (80GB) GPUs and following the original Alpaca setting and hyperparameters.) Overall, AlpaGasus demonstrates a novel data-centric IFT paradigm that can be generally applied to instruction-tuning data, leading to faster training and better instruction-following models. Our project page is available at: https://lichang-chen.github.io/AlpaGasus/.
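The data selection idea can be illustrated with a minimal sketch. It assumes each instruction/response pair receives a numeric quality score from some grader and is kept only if the score reaches a cutoff. The names `select_high_quality` and `toy_score`, the 0-to-5 scale, and the 4.5 cutoff are all hypothetical choices made for this illustration, and the toy grader merely stands in for a call to a strong LLM; none of these details are stated in the abstract.

```python
from typing import Callable, Dict, List

Example = Dict[str, str]


def select_high_quality(
    data: List[Example],
    score_fn: Callable[[Example], float],
    threshold: float,
) -> List[Example]:
    """Keep only the examples whose score is at or above the threshold."""
    return [ex for ex in data if score_fn(ex) >= threshold]


def toy_score(ex: Example) -> float:
    # Hypothetical stand-in for a strong-LLM grader: an empty response
    # gets the lowest score, anything else gets the highest.
    return 0.0 if not ex["output"].strip() else 5.0


data = [
    {"instruction": "Add 2 and 3.", "output": "5"},
    {"instruction": "Name a primary color.", "output": ""},
]

# Assumed cutoff on an assumed 0-to-5 scale, for illustration only.
kept = select_high_quality(data, toy_score, threshold=4.5)

# Training-time ratio implied by the reported 80 and 14 minutes.
speedup = 80 / 14
```

Passing the grader in as a function keeps the filtering step independent of whichever model or API produces the scores, so the same sketch would apply to other instruction-tuning datasets.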
