InstructZero: Efficient Instruction Optimization for Black-Box Large Language Models

Abstract

Large language models (LLMs) are instruction followers, but finding the best instruction for a given situation can be challenging, especially for black-box LLMs on which backpropagation is forbidden. Instead of directly optimizing the discrete instruction, we optimize a low-dimensional soft prompt applied to an open-source LLM to generate the instruction for the black-box LLM. On each iteration of the proposed method, which we call InstructZero, a soft prompt is converted into an instruction by the open-source LLM, the instruction is submitted to the black-box LLM for zero-shot evaluation, and the resulting performance is passed to Bayesian optimization, which proposes new soft prompts that improve the zero-shot performance. We evaluate InstructZero on different combinations of open-source LLMs and APIs, including Vicuna and ChatGPT. Our results show that InstructZero outperforms state-of-the-art auto-instruction methods across a variety of downstream tasks. Our code and data are publicly available at https://github.com/Lichang-Chen/InstructZero.
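The iteration described in the abstract can be sketched as a small loop. What follows is a minimal, self-contained Python sketch under strong assumptions, not the authors' implementation. `generate_instruction` and `zero_shot_score` are hypothetical toy stand-ins for the open-source LLM and the black-box LLM (no model or API is called), and the soft prompt is assumed to be a 3-dimensional vector. The Bayesian optimization step is assumed to be a plain RBF-kernel Gaussian process with an upper-confidence-bound acquisition over random candidates, which may differ from the kernel and acquisition function used in the paper.

```python
import math
import random

# Hypothetical phrase slots: each soft-prompt coordinate toggles one phrase.
PHRASES = [
    ("Rewrite the input", "Give the antonym of the input word"),
    ("", " in one word"),
    ("", " and output nothing else"),
]
KEYWORDS = ("antonym", "one word", "nothing else")


def generate_instruction(soft_prompt):
    """Toy stand-in for the open-source LLM: soft prompt -> instruction text."""
    parts = [pair[1] if x > 0 else pair[0] for x, pair in zip(soft_prompt, PHRASES)]
    return "".join(parts) + "."


def zero_shot_score(instruction):
    """Toy stand-in for zero-shot evaluation on the black-box LLM, in [0, 1]."""
    return sum(k in instruction for k in KEYWORDS) / len(KEYWORDS)


def rbf(a, b, length=0.7):
    """RBF kernel between two soft prompts (assumed kernel choice)."""
    return math.exp(-sum((p - q) ** 2 for p, q in zip(a, b)) / (2 * length ** 2))


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(max(s, 1e-12)) if i == j else s / low[j][j]
    return low


def forward_solve(low, b):
    """Solve L y = b for lower-triangular L."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i])
    return y


def backward_solve(low, y):
    """Solve L^T x = y for lower-triangular L."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def propose_next(prompts, scores, rng, n_candidates=200, beta=2.0, noise=1e-3):
    """Fit a GP to (soft prompt, score) pairs and return the candidate with the best UCB."""
    n = len(prompts)
    gram = [[rbf(prompts[i], prompts[j]) + (noise if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    low = cholesky(gram)
    mean_y = sum(scores) / n
    alpha = backward_solve(low, forward_solve(low, [s - mean_y for s in scores]))
    best_x, best_ucb = None, -math.inf
    for _ in range(n_candidates):
        x = [rng.uniform(-1.0, 1.0) for _ in prompts[0]]
        k_star = [rbf(x, p) for p in prompts]
        mu = mean_y + sum(k * a for k, a in zip(k_star, alpha))
        v = forward_solve(low, k_star)
        var = max(1.0 - sum(t * t for t in v), 0.0)
        ucb = mu + beta * math.sqrt(var)
        if ucb > best_ucb:
            best_x, best_ucb = x, ucb
    return best_x


def instruct_zero_sketch(iterations=15, dim=3, seed=0):
    """Loop: soft prompt -> instruction -> zero-shot score -> new soft prompt."""
    rng = random.Random(seed)
    prompts, scores, history = [], [], []
    x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    for _ in range(iterations):
        instruction = generate_instruction(x)   # open-source LLM step (toy)
        score = zero_shot_score(instruction)    # black-box LLM step (toy)
        prompts.append(x)
        scores.append(score)
        history.append((instruction, score))
        x = propose_next(prompts, scores, rng)  # Bayesian optimization step
    return max(history, key=lambda h: h[1]), history
```

Calling `instruct_zero_sketch()` returns the best `(instruction, score)` pair found together with the full history. In a real setup, the two stand-in functions would be replaced by a call to an open-source LLM conditioned on the soft prompt and by an evaluation of the black-box API on held-out task examples.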

