AlpaGasus: Training A Better Alpaca with Fewer Data

Abstract

Large language models (LLMs) obtain instruction-following capability through instruction-finetuning (IFT) on supervised instruction/response data. However, widely used IFT datasets (e.g., Alpaca's 52k data) surprisingly contain many low-quality instances with incorrect or irrelevant responses, which are misleading and detrimental to IFT. In this paper, we propose a simple and effective data selection strategy that automatically identifies and removes low-quality data using a strong LLM (e.g., ChatGPT). To this end, we introduce AlpaGasus, which is finetuned on only 9k high-quality examples filtered from the 52k Alpaca data. AlpaGasus significantly outperforms the original Alpaca as evaluated by GPT-4 on multiple test sets, and its 13B variant achieves more than 90% of the performance of its teacher LLM (i.e., Text-Davinci-003) on test tasks. It also provides 5.7x faster training, reducing the training time for a 7B variant from 80 minutes (for Alpaca) to 14 minutes. (We apply IFT for the same number of epochs as Alpaca (7B) but on less data, using 4× NVIDIA A100 (80GB) GPUs and following the original Alpaca setting and hyperparameters.) Overall, AlpaGasus demonstrates a novel data-centric IFT paradigm that can be generally applied to instruction-tuning data, leading to faster training and better instruction-following models. Our project page is available at: https://lichang-chen.github.io/AlpaGasus/.
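
The data selection idea in the abstract (score each instruction/response pair with a strong LLM, then keep only the high-scoring pairs) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the abstract: the `Example` type, the `select_high_quality` function, the `toy_grader` stand-in for a real LLM grader, the 0 to 5 score scale, and the threshold value. Only the 80-minute and 14-minute training times come from the text above.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Example:
    """One instruction/response pair from an IFT dataset (hypothetical type)."""
    instruction: str
    response: str


def select_high_quality(
    examples: List[Example],
    grade: Callable[[Example], float],
    threshold: float,
) -> List[Example]:
    """Keep only the examples whose grader score reaches the threshold."""
    return [ex for ex in examples if grade(ex) >= threshold]


def toy_grader(ex: Example) -> float:
    """Placeholder for a strong-LLM grader (assumption, not the paper's prompt).

    A real grader would send the pair to an LLM and parse a numeric score;
    here an empty response simply gets a low score.
    """
    return 5.0 if ex.response.strip() else 1.0


data = [
    Example("Name a primary color.", "Red."),
    Example("Translate 'hello' to French.", ""),
]

# The threshold is an illustrative choice on an assumed 0-5 scale.
kept = select_high_quality(data, toy_grader, threshold=4.5)

# Speedup implied by the training times quoted in the abstract.
speedup = 80 / 14
```

In this sketch the grader is passed in as a function, so a real LLM-backed scorer could be swapped in without changing the filtering step.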
