Learning to Retrieve In-Context Examples for Large Language Models

Abstract

Large language models (LLMs) can learn in-context, performing a variety of tasks from only a few input-output examples. However, the effectiveness of in-context learning depends heavily on the quality of the selected examples. In this paper, we propose a novel framework that iteratively trains dense retrievers to identify high-quality in-context examples for LLMs. The framework first trains a reward model on LLM feedback to evaluate the quality of candidate examples, then uses knowledge distillation to train a bi-encoder-based dense retriever. Experiments on a suite of 30 tasks demonstrate that the framework significantly improves in-context learning performance. We also show that it generalizes to tasks unseen during training. An in-depth analysis reveals that our model improves performance by retrieving examples with similar patterns, and that the gains are consistent across LLMs of varying sizes.
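
To make the distillation step more concrete, the following is a minimal sketch of how reward-model scores might supervise a bi-encoder retriever. It is an illustration under stated assumptions, not the training procedure itself: the abstract does not specify the loss, so the KL-divergence objective, the dot-product scoring, and the helper names (`softmax`, `distillation_loss`, `retrieve_top_k`) are all hypothetical choices, and the embedding vectors are toy values standing in for real encoder outputs.

```python
import math


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def distillation_loss(reward_scores, query_vec, candidate_vecs):
    """KL(teacher || student) over the candidates of a single query.

    Assumption: the teacher distribution is a softmax over reward-model
    scores, and the student distribution is a softmax over bi-encoder
    dot products between the query and each candidate embedding.
    """
    teacher = softmax(reward_scores)
    student = softmax([dot(query_vec, c) for c in candidate_vecs])
    return sum(t * math.log(t / s) for t, s in zip(teacher, student) if t > 0)


def retrieve_top_k(query_vec, candidate_vecs, k):
    """Indices of the k candidates with the highest dot-product score."""
    ranked = sorted(
        range(len(candidate_vecs)),
        key=lambda i: dot(query_vec, candidate_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


# Toy embeddings (hypothetical values, not real encoder outputs).
query = [1.0, 0.0]
candidates = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]

# The student scores are 2, 0, 1. A teacher that agrees gives a loss near zero;
# a teacher that prefers a different candidate gives a positive loss.
agreeing_loss = distillation_loss([2.0, 0.0, 1.0], query, candidates)
disagreeing_loss = distillation_loss([0.0, 2.0, 1.0], query, candidates)
top_two = retrieve_top_k(query, candidates, 2)
```

In this sketch, minimizing the loss would push the retriever's similarity ranking toward the reward model's ranking, so that at inference time only the inexpensive bi-encoder lookup in `retrieve_top_k` is needed to select in-context examples.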
