
ReaRAG: Knowledge-guided Reasoning Enhances Factuality of Large Reasoning Models with Iterative Retrieval Augmented Generation

March 27, 2025
作者: Zhicheng Lee, Shulin Cao, Jinxin Liu, Jiajie Zhang, Weichuan Liu, Xiaoyin Che, Lei Hou, Juanzi Li
cs.AI

Abstract

Large Reasoning Models (LRMs) exhibit remarkable reasoning abilities but rely primarily on parametric knowledge, which limits factual accuracy. While recent works equip reinforcement learning (RL)-based LRMs with retrieval capabilities, they suffer from overthinking and a lack of robustness in reasoning, reducing their effectiveness on question answering (QA) tasks. To address this, we propose ReaRAG, a factuality-enhanced reasoning model that explores diverse queries without excessive iterations. Our solution includes a novel data construction framework with an upper bound on the reasoning chain length. Specifically, we first leverage an LRM to generate deliberate thinking, then select an action from a predefined action space (Search and Finish). For the Search action, a query is executed against the RAG engine, and the result is returned as an observation to guide subsequent reasoning steps. This process iterates until a Finish action is chosen. Benefiting from ReaRAG's strong reasoning capabilities, our approach outperforms existing baselines on multi-hop QA. Further analysis highlights its strong reflective ability to recognize errors and refine its reasoning trajectory. Our study enhances LRMs' factuality while effectively integrating robust reasoning into Retrieval-Augmented Generation (RAG).
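The Thought → Action (Search/Finish) → Observation loop described in the abstract can be sketched as follows. This is a minimal illustration only: the helper functions `llm_generate` and `rag_search`, the prompt format, and the `search[...]`/`finish[...]` action syntax are hypothetical stand-ins, not the paper's actual implementation.

```python
MAX_STEPS = 8  # assumed upper bound on the reasoning chain length


def parse_action(text):
    """Parse 'search[query]' or 'finish[answer]' from model output (assumed format)."""
    text = text.strip()
    for name in ("search", "finish"):
        if text.lower().startswith(name + "[") and text.endswith("]"):
            return name, text[len(name) + 1 : -1]
    return "finish", text  # fall back to treating raw text as the final answer


def rearag_answer(question, llm_generate, rag_search, max_steps=MAX_STEPS):
    """Iterate Thought -> Action -> Observation until Finish or the step cap."""
    context = f"Question: {question}\n"
    for _ in range(max_steps):
        # 1. Deliberate thinking step produced by the reasoning model
        thought = llm_generate(context + "Thought:")
        context += f"Thought: {thought}\n"
        # 2. Choose an action from the predefined space (Search or Finish)
        action, payload = parse_action(llm_generate(context + "Action:"))
        if action == "finish":
            return payload  # final answer terminates the loop
        # 3. Search: query the RAG engine; its result becomes the observation
        observation = rag_search(payload)
        context += f"Action: search[{payload}]\nObservation: {observation}\n"
    # Chain length cap reached: force a final answer
    return llm_generate(context + "Final answer:")
```

The hard cap on iterations mirrors the paper's upper bound on reasoning chain length, which is what curbs the overthinking failure mode of retrieval-equipped LRMs.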
