Auto-Instruct: Automatic Instruction Generation and Ranking for Black-Box Language Models

Abstract

Large language models (LLMs) can perform a wide range of tasks by following natural language instructions, without task-specific fine-tuning. However, the performance of LLMs is strongly influenced by the quality of these instructions, and manually writing effective instructions for each task is a laborious and subjective process. In this paper, we introduce Auto-Instruct, a novel method to automatically improve the quality of instructions provided to LLMs. Our method leverages the inherent generative ability of LLMs to produce diverse candidate instructions for a given task, and then ranks them using a scoring model trained on 575 existing NLP tasks. In experiments on 118 out-of-domain tasks, Auto-Instruct surpasses both human-written instructions and existing baselines of LLM-generated instructions. Furthermore, our method generalizes notably well to other LLMs that were not involved in its training process.
