Less LLM, More Documents: Searching for Improved RAG

Abstract

Retrieval-Augmented Generation (RAG) couples document retrieval with large language models (LLMs). While scaling generators improves accuracy, it also raises cost and limits deployability. We explore an orthogonal axis: enlarging the retriever's corpus to reduce reliance on large LLMs. Experimental results show that corpus scaling consistently strengthens RAG and can often serve as a substitute for increasing model size, though with diminishing returns at larger scales. Small- and mid-sized generators paired with larger corpora often rival much larger models with smaller corpora; mid-sized models tend to gain the most, while tiny and large models benefit less. Our analysis shows that improvements arise primarily from increased coverage of answer-bearing passages, while utilization efficiency remains largely unchanged. These findings establish a principled corpus-generator trade-off: investing in larger corpora offers an effective path to stronger RAG, often comparable to enlarging the LLM itself.
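
To make the distinction between coverage and utilization concrete, the following is a minimal sketch. It assumes coverage is the fraction of questions whose retrieved passages contain the answer, and utilization is the fraction of those covered questions that the generator answers correctly. These definitions, the `Record` type, and the toy records are assumptions made for illustration only; they are not the paper's exact metrics or results.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Record:
    """One evaluated question (hypothetical structure, for illustration)."""
    covered: bool  # retrieved passages contain an answer-bearing passage
    correct: bool  # the generator's final answer is correct


def coverage(records: List[Record]) -> float:
    """Fraction of questions with at least one answer-bearing passage retrieved."""
    if not records:
        return 0.0
    return sum(r.covered for r in records) / len(records)


def utilization(records: List[Record]) -> float:
    """Fraction of covered questions that are answered correctly."""
    covered = [r for r in records if r.covered]
    if not covered:
        return 0.0
    return sum(r.correct for r in covered) / len(covered)


# Toy records (made up): the same generator evaluated with a small and a large corpus.
small_corpus = (
    [Record(True, True)] * 2 + [Record(True, False)] * 2 + [Record(False, False)] * 6
)
large_corpus = (
    [Record(True, True)] * 4 + [Record(True, False)] * 4 + [Record(False, False)] * 2
)

for name, recs in [("small", small_corpus), ("large", large_corpus)]:
    print(name, "coverage:", coverage(recs), "utilization:", utilization(recs))
```

In this toy setup the larger corpus doubles coverage while utilization stays the same, which is the pattern the abstract describes: more questions become answerable because the evidence is retrieved, not because the generator uses retrieved evidence more effectively.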
