RAVEN: In-Context Learning with Retrieval Augmented Encoder-Decoder Language Models
August 15, 2023
Authors: Jie Huang, Wei Ping, Peng Xu, Mohammad Shoeybi, Kevin Chen-Chuan Chang, Bryan Catanzaro
cs.AI
Abstract
In this paper, we investigate the in-context learning ability of
retrieval-augmented encoder-decoder language models. We first conduct a
comprehensive analysis of the state-of-the-art ATLAS model and identify its
limitations in in-context learning, primarily due to a mismatch between
pretraining and testing, as well as a restricted context length. To address
these issues, we propose RAVEN, a model that combines retrieval-augmented
masked language modeling and prefix language modeling. We further introduce
Fusion-in-Context Learning to enhance the few-shot performance by enabling the
model to leverage more in-context examples without requiring additional
training or model modifications. Through extensive experiments, we demonstrate
that RAVEN significantly outperforms ATLAS and achieves results comparable to
the most advanced language models in certain scenarios, despite having
substantially fewer parameters. Our work underscores the potential of
retrieval-augmented encoder-decoder language models for in-context learning and
encourages further research in this direction.
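The Fusion-in-Context Learning idea summarized above can be made concrete with a small input-construction example. The following is a minimal sketch, assuming a Fusion-in-Decoder-style setup in which each retrieved passage becomes a separate encoder input and the decoder attends over all encoded inputs jointly; build_fused_inputs and the prompt format are hypothetical names for illustration, not the paper's actual implementation.

```python
# Sketch of Fusion-in-Context Learning: rather than packing every few-shot
# demonstration into one length-limited input, each retrieved passage is
# paired with a different slice of the demonstration pool, and the
# separately encoded inputs are fused by the decoder. All names here are
# illustrative assumptions, not the authors' API.

from typing import List

def build_fused_inputs(
    question: str,
    passages: List[str],        # passages returned by the retriever
    examples: List[str],        # formatted in-context demonstrations
    shots_per_input: int = 2,   # demonstrations attached to each encoder input
) -> List[str]:
    """Pair each retrieved passage with a rotating slice of demonstrations,
    so the model effectively sees far more examples than one window holds."""
    if not examples:
        shots_per_input = 0
    inputs = []
    for i, passage in enumerate(passages):
        # Rotate through the pool so different passages carry different shots.
        start = i * shots_per_input
        shots = [examples[(start + j) % len(examples)]
                 for j in range(shots_per_input)]
        inputs.append("\n".join(shots + [f"Context: {passage}",
                                         f"Question: {question}",
                                         "Answer:"]))
    # Each string is encoded independently; the decoder attends over all of
    # them jointly, so no retraining or architecture change is required.
    return inputs

if __name__ == "__main__":
    demos = [f"Question: q{k}\nAnswer: a{k}" for k in range(8)]
    for text in build_fused_inputs("Who proposed RAVEN?",
                                   ["passage A", "passage B"], demos):
        print(text + "\n---")
```

This matches the abstract's claim that few-shot performance can be improved without additional training or model modifications: only the construction of the encoder inputs changes.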