Less LLM, More Documents: Searching for Improved RAG
October 3, 2025
Authors: Jingjie Ning, Yibo Kong, Yunfan Long, Jamie Callan
cs.AI
Abstract
Retrieval-Augmented Generation (RAG) couples document retrieval with large
language models (LLMs). While scaling generators improves accuracy, it also
raises cost and limits deployability. We explore an orthogonal axis: enlarging
the retriever's corpus to reduce reliance on large LLMs. Experimental results
show that corpus scaling consistently strengthens RAG and can often serve as a
substitute for increasing model size, though with diminishing returns at larger
scales. Small- and mid-sized generators paired with larger corpora often rival
much larger models with smaller corpora; mid-sized models tend to gain the
most, while tiny and large models benefit less. Our analysis shows that
improvements arise primarily from increased coverage of answer-bearing
passages, while utilization efficiency remains largely unchanged. These
findings establish a principled corpus-generator trade-off: investing in larger
corpora offers an effective path to stronger RAG, often comparable to enlarging
the LLM itself.
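
The abstract attributes corpus-scaling gains to higher coverage of answer-bearing passages rather than to better utilization. Below is a minimal sketch of how such a coverage/utilization decomposition can be measured, assuming a string-match proxy for answer-bearing passages; the record fields (`answers`, `retrieved_passages`, `prediction`) are illustrative names, not an interface from the paper.

```python
# Sketch of the coverage vs. utilization decomposition described above.
# Each example record is assumed to hold the gold answer strings, the
# top-k retrieved passages, and the generator's predicted answer.

def contains_answer(passage: str, answers: list[str]) -> bool:
    """String-match check for an answer-bearing passage (a common RAG proxy)."""
    text = passage.lower()
    return any(a.lower() in text for a in answers)

def coverage_and_utilization(examples: list[dict]) -> tuple[float, float]:
    """
    coverage    = fraction of questions whose retrieved set contains an answer
    utilization = accuracy restricted to questions where coverage succeeded,
                  i.e. how well the generator exploits answer-bearing passages
    """
    covered = 0
    correct_given_covered = 0
    for ex in examples:
        has_answer = any(
            contains_answer(p, ex["answers"]) for p in ex["retrieved_passages"]
        )
        if has_answer:
            covered += 1
            if any(a.lower() in ex["prediction"].lower() for a in ex["answers"]):
                correct_given_covered += 1
    n = len(examples)
    coverage = covered / n if n else 0.0
    utilization = correct_given_covered / covered if covered else 0.0
    return coverage, utilization
```

Under this decomposition, the paper's finding corresponds to coverage rising as the corpus grows while utilization stays roughly flat, so end-to-end accuracy (approximately coverage × utilization) improves mainly through retrieval.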