What Is Seen Cannot Be Unseen: The Disruptive Effect of Knowledge Conflict on Large Language Models
June 6, 2025
Authors: Kaiser Sun, Fan Bai, Mark Dredze
cs.AI
Abstract
Large language models frequently rely on both contextual input and parametric
knowledge to perform tasks. However, these sources can come into conflict,
especially when retrieved documents contradict the model's parametric
knowledge. We propose a diagnostic framework to systematically evaluate LLM
behavior under context-memory conflict, where contextual information
diverges from the model's parametric beliefs. We construct diagnostic data that
elicit these conflicts and analyze model performance across multiple task
types. Our findings reveal that (1) knowledge conflict has minimal impact on
tasks that do not require knowledge utilization, (2) model performance is
consistently higher when contextual and parametric knowledge are aligned, (3)
models are unable to fully suppress their internal knowledge even when
instructed, and (4) providing rationales that explain the conflict increases
reliance on contexts. These insights raise concerns about the validity of
model-based evaluation and underscore the need to account for knowledge
conflict in the deployment of LLMs.
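
To make the setup concrete, below is a minimal, hypothetical sketch of how a context-memory conflict probe could be constructed and scored. Everything in it, including the item fields, the prompt template, the string-matching scoring heuristic, and the toy query_model stand-in, is an assumption made for illustration and is not code or data from the paper.

```python
# Illustrative sketch only: all names, prompts, and fields are assumptions for exposition.
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class DiagnosticItem:
    question: str           # factual question posed to the model
    parametric_answer: str  # answer the model gives with no context (its "belief")
    context: str            # passage supplied in the prompt
    condition: str          # "aligned" or "conflicting"


def build_items(question: str, parametric_answer: str, counterfactual_answer: str) -> List[DiagnosticItem]:
    """Create an aligned and a conflicting variant of the same question."""
    aligned_ctx = f"Background: {question} The answer is {parametric_answer}."
    conflict_ctx = f"Background: {question} The answer is {counterfactual_answer}."
    return [
        DiagnosticItem(question, parametric_answer, aligned_ctx, "aligned"),
        DiagnosticItem(question, parametric_answer, conflict_ctx, "conflicting"),
    ]


def evaluate(items: List[DiagnosticItem], query_model: Callable[[str], str]) -> Dict[str, Dict[str, int]]:
    """Count how often the model follows the supplied context versus its parametric answer."""
    counts = {"aligned": {"context": 0, "parametric": 0, "other": 0},
              "conflicting": {"context": 0, "parametric": 0, "other": 0}}
    for item in items:
        prompt = (f"Answer using only the passage below.\n\n"
                  f"{item.context}\n\nQuestion: {item.question}\nAnswer:")
        answer = query_model(prompt).strip().lower()
        # The answer stated in the context is the text after the last "is " in the passage.
        context_answer = item.context.rsplit("is ", 1)[-1].rstrip(".").lower()
        if context_answer in answer:
            counts[item.condition]["context"] += 1
        elif item.parametric_answer.lower() in answer:
            counts[item.condition]["parametric"] += 1
        else:
            counts[item.condition]["other"] += 1
    return counts


if __name__ == "__main__":
    # A trivial stand-in "model" that always repeats its parametric belief,
    # mimicking finding (3): internal knowledge is not fully suppressed.
    items = build_items("What is the capital of Australia?", "Canberra", "Sydney")
    print(evaluate(items, lambda prompt: "Canberra"))
```

Under this toy scoring rule, an answer matching the conflicting context counts as context reliance, while an answer matching the model's no-context prediction counts as parametric reliance; comparing the two conditions is one simple way to operationalize the aligned-versus-conflicting contrast the abstract describes.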