Chain-of-Retrieval Augmented Generation

Abstract

This paper introduces an approach for training o1-like RAG models that retrieve and reason over relevant information step by step before generating the final answer. Conventional RAG methods usually perform a single retrieval step before generation, which limits their effectiveness on complex queries because retrieval results are imperfect. In contrast, our proposed method, CoRAG (Chain-of-Retrieval Augmented Generation), allows the model to dynamically reformulate the query based on the evolving state. To train CoRAG effectively, we use rejection sampling to automatically generate intermediate retrieval chains, thereby augmenting existing RAG datasets that provide only the correct final answer. At test time, we propose various decoding strategies that scale the model's test-time compute by controlling the length and number of sampled retrieval chains. Experimental results across multiple benchmarks validate the efficacy of CoRAG, particularly on multi-hop question answering tasks, where we observe an improvement of more than 10 points in EM score over strong baselines. On the KILT benchmark, CoRAG establishes new state-of-the-art performance across a diverse range of knowledge-intensive tasks. Furthermore, we offer comprehensive analyses of the scaling behavior of CoRAG, laying the groundwork for future research aimed at developing factual and grounded foundation models.
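
To make the idea of a retrieval chain concrete, the following is a minimal Python sketch of iterative retrieval with query reformulation. It is an illustration under stated assumptions, not the paper's implementation: the function names (`chain_of_retrieval`, `toy_retrieve`, `toy_next_query`, `toy_answer`), the three-sentence corpus, and the word-overlap retriever are all hypothetical stand-ins for a trained model and a real retriever. The `max_steps` parameter loosely mirrors the idea of controlling chain length at test time.

```python
from typing import Callable, List, Optional, Tuple

# One step of a chain: (sub-query, passages retrieved for it).
Step = Tuple[str, List[str]]

# Hypothetical toy corpus; a real system would query a retrieval index.
CORPUS = [
    "Film X was directed by Jane Doe.",
    "Jane Doe was born in Springfield.",
    "Springfield is a fictional town.",
]


def toy_retrieve(query: str, k: int = 1) -> List[str]:
    """Rank passages by word overlap with the query (stand-in retriever)."""
    q_words = set(query.lower().strip("?.").split())
    ranked = sorted(
        CORPUS,
        key=lambda p: -len(q_words & set(p.lower().strip(".").split())),
    )
    return ranked[:k]


def toy_next_query(question: str, chain: List[Step]) -> Optional[str]:
    """Stand-in for the model: reformulate the query from the chain so far."""
    if not chain:
        return "Who directed Film X?"
    if len(chain) == 1:
        passage = chain[0][1][0]
        director = passage.split("directed by ")[1].rstrip(".")
        return f"Where was {director} born?"
    return None  # The chain is judged sufficient; stop retrieving.


def toy_answer(question: str, chain: List[Step]) -> str:
    """Stand-in for final answer generation from the last retrieved passage."""
    passage = chain[-1][1][0] if chain else ""
    if "born in " not in passage:
        return "unknown"
    return passage.split("born in ")[1].rstrip(".")


def chain_of_retrieval(
    question: str,
    next_query: Callable[[str, List[Step]], Optional[str]],
    retrieve: Callable[[str], List[str]],
    answer: Callable[[str, List[Step]], str],
    max_steps: int = 4,
) -> Tuple[str, List[Step]]:
    """Alternate between query reformulation and retrieval, then answer."""
    chain: List[Step] = []
    for _ in range(max_steps):
        sub_query = next_query(question, chain)
        if sub_query is None:
            break
        chain.append((sub_query, retrieve(sub_query)))
    return answer(question, chain), chain


question = "Where was the director of Film X born?"
final_answer, chain = chain_of_retrieval(
    question, toy_next_query, toy_retrieve, toy_answer
)
print(final_answer)
for sub_query, passages in chain:
    print(sub_query, "->", passages)
```

In this toy setting, the two-hop question is answered only when the chain is allowed at least two steps; with `max_steps=1` the stub falls back to `"unknown"`, which is a rough analogue of how a longer chain can spend more test-time compute to recover information that a single retrieval step misses.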