Micro-Act: Mitigate Knowledge Conflict in Question Answering via Actionable Self-Reasoning
June 5, 2025
Authors: Nan Huo, Jinyang Li, Bowen Qin, Ge Qu, Xiaolong Li, Xiaodong Li, Chenhao Ma, Reynold Cheng
cs.AI
Abstract
Retrieval-Augmented Generation (RAG) systems commonly suffer from Knowledge
Conflicts, where retrieved external knowledge contradicts the inherent,
parametric knowledge of large language models (LLMs). These conflicts adversely
affect performance on downstream tasks such as question answering (QA).
Existing approaches often attempt to mitigate conflicts by directly comparing
the two knowledge sources side by side, but this can overwhelm LLMs with
extraneous or lengthy contexts, ultimately hindering their ability to identify
and resolve inconsistencies. To address this issue, we propose Micro-Act, a
framework with a hierarchical action space that automatically perceives context
complexity and adaptively decomposes each knowledge source into a sequence of
fine-grained comparisons. These comparisons are represented as actionable
steps, enabling reasoning beyond the superficial context. Through extensive
experiments on five benchmark datasets, Micro-Act consistently achieves
significant gains in QA accuracy over state-of-the-art baselines across all
five datasets and three conflict types, especially the temporal and semantic
types on which all baselines degrade significantly. Moreover, Micro-Act
simultaneously exhibits robust performance on non-conflict questions,
highlighting its practical value in real-world RAG applications.
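
To make the decomposition idea concrete, here is a minimal Python sketch of the abstract's core mechanism: eliciting the model's parametric answer, breaking both knowledge sources into fine-grained claims, and comparing them one step at a time rather than side by side. This is our illustration, not the authors' code: the ask(prompt) -> str interface, the prompts, the word-count threshold used as a complexity proxy, and the pairwise zip of claims are all hypothetical simplifications of the hierarchical action space described in the paper.

    from typing import Callable, List

    def decompose(source: str, ask: Callable[[str], str]) -> List[str]:
        """Split a knowledge source into atomic claims, one per output line."""
        reply = ask("List the atomic factual claims, one per line:\n" + source)
        return [ln.strip("- ").strip() for ln in reply.splitlines() if ln.strip()]

    def micro_act_sketch(question: str, retrieved: str,
                         ask: Callable[[str], str]) -> str:
        """Conflict-aware QA via step-wise comparisons of fine-grained claims."""
        # Elicit the model's parametric knowledge without the retrieved context.
        parametric = ask("Answer from your own knowledge only: " + question)
        # Crude proxy for "context complexity": short contexts are compared
        # directly, longer ones are decomposed first (threshold is hypothetical).
        if len(retrieved.split()) < 50:
            return ask(f"Question: {question}\nContext: {retrieved}\n"
                       f"Your prior answer: {parametric}\n"
                       f"Resolve any conflict and answer.")
        ext_claims = decompose(retrieved, ask)
        par_claims = decompose(parametric, ask)
        # Each comparison is a small, focused step, so the model never has to
        # weigh the two full contexts against each other at once.
        verdicts = [
            ask("Do these two claims conflict, and if so which is more "
                f"reliable?\nA (retrieved): {ext}\nB (parametric): {par}")
            for ext, par in zip(ext_claims, par_claims)
        ]
        findings = "\n".join(verdicts)
        return ask(f"Question: {question}\n"
                   f"Step-wise conflict analysis:\n{findings}\n"
                   f"Give the final answer.")

The key design point the sketch mirrors is that each LLM call sees only one small comparison, which is what lets the reasoning move beyond the superficial, full-length contexts that overwhelm side-by-side approaches.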