ReaRAG: Knowledge-guided Reasoning Enhances Factuality of Large Reasoning Models with Iterative Retrieval Augmented Generation

Abstract

Large Reasoning Models (LRMs) exhibit remarkable reasoning abilities but rely primarily on parametric knowledge, which limits their factual accuracy. Recent work equips reinforcement learning (RL)-based LRMs with retrieval capabilities, but these models suffer from overthinking and lack robustness in reasoning, which reduces their effectiveness on question answering (QA) tasks. To address this, we propose ReaRAG, a factuality-enhanced reasoning model that explores diverse queries without excessive iterations. Our solution includes a novel data construction framework with an upper bound on the reasoning chain length. Specifically, we first leverage an LRM to generate deliberate thinking, then select an action from a predefined action space (Search and Finish). For a Search action, a query is executed against the RAG engine, and the result is returned as an observation that guides subsequent reasoning steps. This process iterates until a Finish action is chosen. Benefiting from ReaRAG's strong reasoning capabilities, our approach outperforms existing baselines on multi-hop QA. Further analysis highlights its strong reflective ability to recognize errors and refine its reasoning trajectory. Our study enhances the factuality of LRMs while effectively integrating robust reasoning for Retrieval-Augmented Generation (RAG).
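The thought, action, observation loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Step`, `run_reasoning_loop`, `policy`, and `retrieve`, as well as the default bound `max_steps=8`, are hypothetical and chosen only for illustration. The policy stands in for the reasoning model and the retriever stands in for the RAG engine; both are toy functions over made-up data.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Step:
    thought: str           # deliberate thinking produced before acting
    action: str            # "search" or "finish" (the predefined action space)
    argument: str          # query for "search", final answer for "finish"
    observation: str = ""  # retrieval result; stays empty for "finish"


# Hypothetical signatures: the policy sees the question and the trajectory so far.
Policy = Callable[[str, List[Step]], Tuple[str, str, str]]
Retriever = Callable[[str], str]


def run_reasoning_loop(
    question: str,
    policy: Policy,
    retrieve: Retriever,
    max_steps: int = 8,  # assumed upper bound on the reasoning chain length
) -> Tuple[Optional[str], List[Step]]:
    """Iterate thought -> action -> observation until "finish" or the bound."""
    trajectory: List[Step] = []
    for _ in range(max_steps):
        thought, action, argument = policy(question, trajectory)
        if action == "finish":
            trajectory.append(Step(thought, action, argument))
            return argument, trajectory
        if action != "search":
            raise ValueError(f"unknown action: {action!r}")
        # Search: run the query and feed the result back as an observation.
        observation = retrieve(argument)
        trajectory.append(Step(thought, action, argument, observation))
    # Bound reached without a "finish" action: no answer is returned.
    return None, trajectory


def toy_retrieve(query: str) -> str:
    """Toy stand-in for a RAG engine over made-up facts."""
    corpus = {
        "author of Book A": "Writer B",
        "birthplace of Writer B": "Town C",
    }
    return corpus.get(query, "no result")


def toy_policy(question: str, trajectory: List[Step]) -> Tuple[str, str, str]:
    """Toy stand-in for the reasoning model on a two-hop question."""
    if not trajectory:
        return ("First find the author.", "search", "author of Book A")
    if len(trajectory) == 1:
        author = trajectory[0].observation
        return (f"The author is {author}; now find the birthplace.",
                "search", f"birthplace of {author}")
    return ("Both hops are resolved.", "finish", trajectory[-1].observation)
```

Calling `run_reasoning_loop("Where was the author of Book A born?", toy_policy, toy_retrieve)` performs two Search steps followed by a Finish step. Lowering `max_steps` below three makes the loop stop early and return `None` as the answer, which is how the bound prevents unbounded iteration in this sketch.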
