RAVEN: In-Context Learning with Retrieval Augmented Encoder-Decoder Language Models
August 15, 2023
Authors: Jie Huang, Wei Ping, Peng Xu, Mohammad Shoeybi, Kevin Chen-Chuan Chang, Bryan Catanzaro
cs.AI
Abstract
In this paper, we investigate the in-context learning ability of
retrieval-augmented encoder-decoder language models. We first conduct a
comprehensive analysis of the state-of-the-art ATLAS model and identify its
limitations in in-context learning, primarily due to a mismatch between
pretraining and testing, as well as a restricted context length. To address
these issues, we propose RAVEN, a model that combines retrieval-augmented
masked language modeling and prefix language modeling. We further introduce
Fusion-in-Context Learning to enhance the few-shot performance by enabling the
model to leverage more in-context examples without requiring additional
training or model modifications. Through extensive experiments, we demonstrate
that RAVEN significantly outperforms ATLAS and achieves results comparable to
the most advanced language models in certain scenarios, despite having
substantially fewer parameters. Our work underscores the potential of
retrieval-augmented encoder-decoder language models for in-context learning and
encourages further research in this direction.
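To illustrate the idea behind Fusion-in-Context Learning mentioned above, the following is a minimal sketch (not the authors' code): it assumes a Fusion-in-Decoder style encoder-decoder, where each retrieved passage is encoded separately and the decoder attends over the concatenated encodings. Distributing different in-context examples across the per-passage encoder inputs lets the model effectively see more examples than a single encoder context could hold, without retraining. The function name, prompt layout, and the examples-per-passage split are illustrative assumptions.

```python
# Illustrative sketch of Fusion-in-Context Learning style input construction.
# Assumption: the model encodes each returned string independently and the
# decoder fuses all encodings, so every example below influences generation
# even though no single encoder input contains all of them.

from itertools import cycle
from typing import List


def build_encoder_inputs(
    question: str,
    passages: List[str],
    examples: List[str],
    examples_per_passage: int = 2,  # assumed split; not from the paper text
) -> List[str]:
    """Pair each retrieved passage with a rotating slice of in-context examples."""
    example_iter = cycle(examples)
    inputs = []
    for passage in passages:
        demos = [next(example_iter) for _ in range(examples_per_passage)]
        inputs.append(
            "\n".join(demos + [f"Question: {question}", f"Passage: {passage}"])
        )
    return inputs


if __name__ == "__main__":
    demos = [
        "Question: Who wrote Hamlet? Answer: William Shakespeare",
        "Question: What is the capital of France? Answer: Paris",
        "Question: Who painted the Mona Lisa? Answer: Leonardo da Vinci",
        "Question: What is 2 + 2? Answer: 4",
    ]
    passages = ["Retrieved passage A ...", "Retrieved passage B ..."]
    for enc_input in build_encoder_inputs("Who discovered penicillin?", passages, demos):
        print(enc_input, end="\n---\n")
```

Each string printed above would be fed to the encoder as one passage-plus-demonstrations input; the fusion happens in the decoder's cross-attention over all encodings, which is why no additional training or architectural change is required.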