

DRAGged into Conflicts: Detecting and Addressing Conflicting Sources in Search-Augmented LLMs

June 10, 2025
Authors: Arie Cattan, Alon Jacovi, Ori Ram, Jonathan Herzig, Roee Aharoni, Sasha Goldshtein, Eran Ofek, Idan Szpektor, Avi Caciularu
cs.AI

Abstract

Retrieval Augmented Generation (RAG) is a commonly used approach for enhancing large language models (LLMs) with relevant and up-to-date information. However, the retrieved sources often contain conflicting information, and it remains unclear how models should address such discrepancies. In this work, we first propose a novel taxonomy of knowledge conflict types in RAG, along with the desired model behavior for each type. We then introduce CONFLICTS, a high-quality benchmark with expert annotations of conflict types in a realistic RAG setting. CONFLICTS is the first benchmark that enables tracking progress on how models address a wide range of knowledge conflicts. We conduct extensive experiments on this benchmark, showing that LLMs often struggle to appropriately resolve conflicts between sources. While prompting LLMs to explicitly reason about potential conflicts in the retrieved documents significantly improves the quality and appropriateness of their responses, substantial room remains for improvement in future research.
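The conflict-aware prompting strategy mentioned in the abstract can be illustrated with a minimal sketch. The snippet below is an assumption-based illustration, not the authors' actual prompt or evaluation code: it numbers the retrieved passages and asks the model to first identify any disagreement among them before answering; `call_llm` is a hypothetical stand-in for whatever chat-completion API is used.

```python
# Minimal sketch of conflict-aware RAG prompting (illustrative only; not the paper's prompt).
# `call_llm` below is a hypothetical wrapper around any chat-completion API.

def build_conflict_aware_prompt(question: str, documents: list[str]) -> str:
    # Number the retrieved passages so the model can refer to them when flagging conflicts.
    numbered = "\n\n".join(f"[Doc {i + 1}] {doc}" for i, doc in enumerate(documents))
    return (
        "You are given a question and several retrieved documents.\n"
        "First, state whether the documents conflict on the answer and, if so, how.\n"
        "Then answer the question, acknowledging any unresolved disagreement.\n\n"
        f"Documents:\n{numbered}\n\nQuestion: {question}\nAnswer:"
    )


if __name__ == "__main__":
    docs = [
        "Source A reports the bridge opened in 1932.",
        "Source B states the bridge opened in 1935.",
    ]
    prompt = build_conflict_aware_prompt("When did the bridge open?", docs)
    print(prompt)
    # response = call_llm(prompt)  # hypothetical LLM call
```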