DoTA-RAG: Dynamic of Thought Aggregation RAG

Abstract

In this paper, we introduce DoTA-RAG (Dynamic-of-Thought Aggregation RAG), a retrieval-augmented generation system optimized for high-throughput, large-scale web knowledge indexes. Traditional RAG pipelines often suffer from high latency and limited accuracy over massive, diverse datasets. DoTA-RAG addresses these challenges with a three-stage pipeline: query rewriting, dynamic routing to specialized sub-indexes, and multi-stage retrieval and ranking. We further enhance retrieval by evaluating candidate embedding models, selecting a superior one, and re-embedding the large FineWeb-10BT corpus. Moreover, we create a diverse Q&A dataset of 500 questions, generated via the DataMorgana setup, across a broad range of WebOrganizer topics and formats. DoTA-RAG improves the answer correctness score from 0.752 (baseline, using the LiveRAG pre-built vector store) to 1.478 while maintaining low latency, and it achieves a correctness score of 0.929 on the Live Challenge Day. These results highlight DoTA-RAG's potential for practical deployment in domains requiring fast, reliable access to large and evolving knowledge sources.
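
To make the control flow of the three stages concrete, here is a minimal sketch in Python. It is not the authors' implementation. The toy corpus, the function names (`rewrite_query`, `route`, `retrieve`, `rerank`, `answer_context`), and the token-overlap scoring are all assumptions chosen for illustration. They stand in for the components the abstract names (query rewriting, routing to sub-indexes, retrieval followed by ranking), whose actual models and parameters are not specified here.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    text: str
    topic: str


# Hypothetical toy corpus, partitioned into topic sub-indexes.
SUB_INDEXES = {
    "science": [
        Doc("Photosynthesis converts light into chemical energy in plants.", "science"),
        Doc("Black holes bend light with their gravity.", "science"),
    ],
    "finance": [
        Doc("Compound interest makes savings grow over time.", "finance"),
        Doc("Bonds pay interest to investors at fixed dates.", "finance"),
    ],
}


def tokens(text):
    """Lower-cased alphabetic tokens of a string, as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))


def rewrite_query(query):
    """Stage 1 stand-in: collapse whitespace and drop a trailing '?'."""
    return " ".join(query.split()).rstrip("?").strip()


def route(query, max_routes=1):
    """Stage 2 stand-in: rank sub-indexes by vocabulary overlap with the query."""
    q = tokens(query)

    def score(topic):
        vocab = set().union(*(tokens(d.text) for d in SUB_INDEXES[topic]))
        return len(q & vocab)

    return sorted(SUB_INDEXES, key=score, reverse=True)[:max_routes]


def retrieve(query, topics, k=3):
    """Stage 3a stand-in: gather candidates from the routed sub-indexes only."""
    q = tokens(query)
    pool = [d for t in topics for d in SUB_INDEXES[t]]
    return sorted(pool, key=lambda d: len(q & tokens(d.text)), reverse=True)[:k]


def rerank(query, candidates, top_n=1):
    """Stage 3b stand-in: re-order candidates by Jaccard similarity."""
    q = tokens(query)

    def jaccard(d):
        t = tokens(d.text)
        return len(q & t) / len(q | t)

    return sorted(candidates, key=jaccard, reverse=True)[:top_n]


def answer_context(query):
    """Run the stages in order and return the passages for the generator."""
    rewritten = rewrite_query(query)
    topics = route(rewritten)
    return rerank(rewritten, retrieve(rewritten, topics))
```

Calling `answer_context("How does compound interest grow savings?")` searches only the `finance` sub-index in this toy setup. That is the point of routing: the retrieval and ranking stages score a fraction of the corpus instead of all of it.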
