RAVEN: In-Context Learning with Retrieval Augmented Encoder-Decoder Language Models
August 15, 2023
Authors: Jie Huang, Wei Ping, Peng Xu, Mohammad Shoeybi, Kevin Chen-Chuan Chang, Bryan Catanzaro
cs.AI
Abstract
In this paper, we investigate the in-context learning ability of
retrieval-augmented encoder-decoder language models. We first conduct a
comprehensive analysis of the state-of-the-art ATLAS model and identify its
limitations in in-context learning, primarily due to a mismatch between
pretraining and testing, as well as a restricted context length. To address
these issues, we propose RAVEN, a model that combines retrieval-augmented
masked language modeling and prefix language modeling. We further introduce
Fusion-in-Context Learning to enhance the few-shot performance by enabling the
model to leverage more in-context examples without requiring additional
training or model modifications. Through extensive experiments, we demonstrate
that RAVEN significantly outperforms ATLAS and achieves results comparable to
the most advanced language models in certain scenarios, despite having
substantially fewer parameters. Our work underscores the potential of
retrieval-augmented encoder-decoder language models for in-context learning and
encourages further research in this direction.
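The Fusion-in-Context Learning idea summarized above can be made concrete with a small input-construction example. The following is a minimal sketch, assuming a Fusion-in-Decoder-style setup in which each retrieved passage becomes a separate encoder input and the decoder attends over all encoded inputs jointly; build_fused_inputs and the prompt format are hypothetical names for illustration, not the paper's actual implementation.

```python
# Sketch of Fusion-in-Context Learning: rather than packing every few-shot
# demonstration into one length-limited input, each retrieved passage is
# paired with a different slice of the demonstration pool, and the
# separately encoded inputs are fused by the decoder. All names here are
# illustrative assumptions, not the authors' API.

from typing import List

def build_fused_inputs(
    question: str,
    passages: List[str],        # passages returned by the retriever
    examples: List[str],        # formatted in-context demonstrations
    shots_per_input: int = 2,   # demonstrations attached to each encoder input
) -> List[str]:
    """Pair each retrieved passage with a rotating slice of demonstrations,
    so the model effectively sees far more examples than one window holds."""
    if not examples:
        shots_per_input = 0
    inputs = []
    for i, passage in enumerate(passages):
        # Rotate through the pool so different passages carry different shots.
        start = i * shots_per_input
        shots = [examples[(start + j) % len(examples)]
                 for j in range(shots_per_input)]
        inputs.append("\n".join(shots + [f"Context: {passage}",
                                         f"Question: {question}",
                                         "Answer:"]))
    # Each string is encoded independently; the decoder attends over all of
    # them jointly, so no retraining or architecture change is required.
    return inputs

if __name__ == "__main__":
    demos = [f"Question: q{k}\nAnswer: a{k}" for k in range(8)]
    for text in build_fused_inputs("Who proposed RAVEN?",
                                   ["passage A", "passage B"], demos):
        print(text + "\n---")
```

This matches the abstract's claim that few-shot performance can be improved without additional training or model modifications: only the construction of the encoder inputs changes.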