
Supervised Knowledge Makes Large Language Models Better In-context Learners

December 26, 2023
作者: Linyi Yang, Shuibai Zhang, Zhuohao Yu, Guangsheng Bao, Yidong Wang, Jindong Wang, Ruochen Xu, Wei Ye, Xing Xie, Weizhu Chen, Yue Zhang
cs.AI

Abstract

Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering. The recent progress in large-scale generative models has further expanded their use in real-world language applications. However, the critical challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored. While previous in-context learning research has focused on enhancing models to adhere to users' specific instructions and quality expectations, and to avoid undesired outputs, little to no work has explored the use of task-Specific fine-tuned Language Models (SLMs) to improve LLMs' in-context learning during the inference stage. Our primary contribution is the establishment of a simple yet effective framework that enhances the reliability of LLMs as it: 1) generalizes to out-of-distribution data, 2) elucidates how LLMs benefit from discriminative models, and 3) minimizes hallucinations in generative tasks. Using our proposed plug-in method, enhanced versions of Llama 2 and ChatGPT surpass their original versions in generalizability and factuality. We offer a comprehensive suite of resources, including 16 curated datasets, prompts, model checkpoints, and LLM outputs across 9 distinct tasks. Our empirical analysis sheds light on the advantages of incorporating discriminative models into LLMs and highlights the potential of our methodology in fostering more reliable LLMs.
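The abstract describes the plug-in method only at a high level. One plausible reading is that the task-specific SLM's prediction is surfaced to the LLM at inference time through the prompt, letting the generative model condition on the discriminative model's output. The sketch below illustrates that reading; the function names, the toy classifier, and the prompt template are all illustrative assumptions, not the paper's actual implementation.

```python
def slm_predict(text: str) -> str:
    """Stand-in for a task-specific fine-tuned SLM (here, a toy
    rule-based sentiment classifier used purely for illustration)."""
    return "positive" if "great" in text.lower() else "negative"

def build_plugin_prompt(text: str, slm_label: str) -> str:
    """Compose an in-context prompt that exposes the SLM's prediction
    to the LLM as auxiliary supervised knowledge (hypothetical template)."""
    return (
        f"Input: {text}\n"
        f"A task-specific classifier predicts: {slm_label}\n"
        "Considering this auxiliary prediction, give the final label:"
    )

if __name__ == "__main__":
    text = "The movie was great!"
    prompt = build_plugin_prompt(text, slm_predict(text))
    print(prompt)  # This prompt would then be sent to Llama 2 or ChatGPT.
```

In this reading, the LLM remains a black box at inference time: only the prompt changes, which is consistent with the paper's claim that the method works as a plug-in for both open (Llama 2) and closed (ChatGPT) models.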