Lost in the Middle: How Language Models Use Long Contexts
July 6, 2023
Authors: Nelson F. Liu, Kevin Lin, John Hewitt, Ashwin Paranjape, Michele Bevilacqua, Fabio Petroni, Percy Liang
cs.AI
Abstract
While recent language models have the ability to take long contexts as input,
relatively little is known about how well language models use longer context.
We analyze language model performance on two tasks that require
identifying relevant information within their input contexts: multi-document
question answering and key-value retrieval. We find that performance is often
highest when relevant information occurs at the beginning or end of the input
context, and significantly degrades when models must access relevant
information in the middle of long contexts. Furthermore, performance
substantially decreases as the input context grows longer, even for explicitly
long-context models. Our analysis provides a better understanding of how
language models use their input context and provides new evaluation protocols
for future long-context models.
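To make the evaluation protocol concrete, the sketch below shows one plausible way to probe positional sensitivity with a synthetic key-value retrieval task of the kind the abstract describes: random UUID key-value pairs are serialized as a JSON object, the queried ("gold") pair is placed at a controlled position, and retrieval accuracy is measured as that position sweeps through the context. The prompt wording, pair counts, and the `query_model` callable are illustrative assumptions, not the authors' exact setup.

```python
import json
import random
import uuid


def make_kv_retrieval_prompt(num_pairs: int, gold_index: int,
                             seed: int = 0) -> tuple[str, str]:
    """Build a synthetic key-value retrieval prompt.

    Generates `num_pairs` random UUID key-value pairs, with the queried
    ("gold") pair at position `gold_index`, and returns the prompt text
    together with the expected answer (the gold value).
    """
    rng = random.Random(seed)
    pairs = [
        (str(uuid.UUID(int=rng.getrandbits(128))),
         str(uuid.UUID(int=rng.getrandbits(128))))
        for _ in range(num_pairs)
    ]
    gold_key, gold_value = pairs[gold_index]

    # dict preserves insertion order (Python 3.7+), so the gold pair
    # stays at the intended position in the serialized context.
    kv_json = json.dumps(dict(pairs), indent=1)
    prompt = (
        "Extract the value corresponding to the specified key in the "
        "JSON object below.\n\n"
        f"{kv_json}\n\n"
        f'Key: "{gold_key}"\n'
        "Corresponding value:"
    )
    return prompt, gold_value


def positional_accuracy(query_model, num_pairs: int = 75,
                        trials: int = 20) -> dict[int, float]:
    """Sweep the gold pair's position through the context and record
    accuracy at each position. `query_model` is a hypothetical stand-in
    for any prompt-in, text-out language model call."""
    results = {}
    for gold_index in range(0, num_pairs, max(1, num_pairs // 5)):
        correct = 0
        for trial in range(trials):
            prompt, gold_value = make_kv_retrieval_prompt(
                num_pairs, gold_index, seed=trial)
            correct += gold_value in query_model(prompt)
        results[gold_index] = correct / trials
    return results
```

Plotting accuracy against `gold_index`, for context sizes at and beyond a model's typical operating range, would surface the positional effect the abstract describes: if the findings hold, accuracy should be highest when the gold pair sits near the beginning or end of the context and lowest when it sits in the middle.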