
ReaRAG: Knowledge-guided Reasoning Enhances Factuality of Large Reasoning Models with Iterative Retrieval Augmented Generation

March 27, 2025
作者: Zhicheng Lee, Shulin Cao, Jinxin Liu, Jiajie Zhang, Weichuan Liu, Xiaoyin Che, Lei Hou, Juanzi Li
cs.AI

Abstract

Large Reasoning Models (LRMs) exhibit remarkable reasoning abilities but rely primarily on parametric knowledge, which limits factual accuracy. While recent works equip reinforcement learning (RL)-based LRMs with retrieval capabilities, they suffer from overthinking and a lack of robustness in reasoning, reducing their effectiveness on question answering (QA) tasks. To address this, we propose ReaRAG, a factuality-enhanced reasoning model that explores diverse queries without excessive iterations. Our solution includes a novel data construction framework with an upper bound on the reasoning chain length. Specifically, we first leverage an LRM to generate deliberate thinking, then select an action from a predefined action space (Search and Finish). For the Search action, a query is executed against the RAG engine, and the result is returned as an observation to guide subsequent reasoning steps. This process iterates until a Finish action is chosen. Benefiting from ReaRAG's strong reasoning capabilities, our approach outperforms existing baselines on multi-hop QA. Further analysis highlights its strong reflective ability to recognize errors and refine its reasoning trajectory. Our study enhances LRMs' factuality while effectively integrating robust reasoning into Retrieval-Augmented Generation (RAG).
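The Thought → Action (Search/Finish) → Observation loop described in the abstract can be sketched as follows. This is a minimal illustration only: the helper functions `llm_generate` and `rag_search`, the prompt format, and the `search[...]`/`finish[...]` action syntax are hypothetical stand-ins, not the paper's actual implementation.

```python
MAX_STEPS = 8  # assumed upper bound on the reasoning chain length


def parse_action(text):
    """Parse 'search[query]' or 'finish[answer]' from model output (assumed format)."""
    text = text.strip()
    for name in ("search", "finish"):
        if text.lower().startswith(name + "[") and text.endswith("]"):
            return name, text[len(name) + 1 : -1]
    return "finish", text  # fall back to treating raw text as the final answer


def rearag_answer(question, llm_generate, rag_search, max_steps=MAX_STEPS):
    """Iterate Thought -> Action -> Observation until Finish or the step cap."""
    context = f"Question: {question}\n"
    for _ in range(max_steps):
        # 1. Deliberate thinking step produced by the reasoning model
        thought = llm_generate(context + "Thought:")
        context += f"Thought: {thought}\n"
        # 2. Choose an action from the predefined space (Search or Finish)
        action, payload = parse_action(llm_generate(context + "Action:"))
        if action == "finish":
            return payload  # final answer terminates the loop
        # 3. Search: query the RAG engine; its result becomes the observation
        observation = rag_search(payload)
        context += f"Action: search[{payload}]\nObservation: {observation}\n"
    # Chain length cap reached: force a final answer
    return llm_generate(context + "Final answer:")
```

The hard cap on iterations mirrors the paper's upper bound on reasoning chain length, which is what curbs the overthinking failure mode of retrieval-equipped LRMs.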
