

Learning to Retrieve In-Context Examples for Large Language Models

July 14, 2023
Authors: Liang Wang, Nan Yang, Furu Wei
cs.AI

Abstract

Large language models (LLMs) have demonstrated their ability to learn in-context, allowing them to perform various tasks based on a few input-output examples. However, the effectiveness of in-context learning is heavily reliant on the quality of the selected examples. In this paper, we propose a novel framework to iteratively train dense retrievers that can identify high-quality in-context examples for LLMs. Our framework initially trains a reward model based on LLM feedback to evaluate the quality of candidate examples, followed by knowledge distillation to train a bi-encoder-based dense retriever. Our experiments on a suite of 30 tasks demonstrate that our framework significantly enhances in-context learning performance. Furthermore, we show the generalization ability of our framework to unseen tasks during training. An in-depth analysis reveals that our model improves performance by retrieving examples with similar patterns, and the gains are consistent across LLMs of varying sizes.
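As a reading aid, the sketch below illustrates what the knowledge-distillation step described in the abstract might look like: a bi-encoder retriever's dot-product scores over candidate in-context examples are matched, via KL divergence, to the score distribution of a reward model trained on LLM feedback. This is a minimal illustrative sketch, not the authors' implementation; the function name, tensor shapes, and temperature are assumptions.

```python
# Minimal sketch (not the authors' code) of the distillation step:
# a bi-encoder dense retriever is trained so that its similarity
# scores over candidate in-context examples match the ranking
# produced by a reward model derived from LLM feedback.
import torch
import torch.nn.functional as F

def distillation_loss(query_emb, cand_embs, reward_scores, temperature=1.0):
    """KL divergence between the retriever's score distribution over
    candidates and the reward model's distribution.

    query_emb:     (d,)   bi-encoder embedding of the test input
    cand_embs:     (n, d) bi-encoder embeddings of n candidate examples
    reward_scores: (n,)   reward-model scores for the same candidates
    """
    retriever_logits = cand_embs @ query_emb / temperature        # (n,)
    log_p_retriever = F.log_softmax(retriever_logits, dim=-1)
    p_reward = F.softmax(reward_scores / temperature, dim=-1)
    # KL(p_reward || p_retriever): pull the retriever's ranking toward
    # the one implied by LLM feedback.
    return F.kl_div(log_p_retriever, p_reward, reduction="sum")

# Toy usage with random tensors standing in for real encoder outputs.
d, n = 16, 8
query_emb = torch.randn(d, requires_grad=True)
cand_embs = torch.randn(n, d, requires_grad=True)
reward_scores = torch.randn(n)
loss = distillation_loss(query_emb, cand_embs, reward_scores)
loss.backward()  # in real training, gradients flow into the encoders
print(float(loss))
```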