Less LLM, More Documents: Searching for Improved RAG

Abstract

Retrieval-Augmented Generation (RAG) couples document retrieval with large language models (LLMs). While scaling generators improves accuracy, it also raises cost and limits deployability. We explore an orthogonal axis: enlarging the retriever's corpus to reduce reliance on large LLMs. Experimental results show that corpus scaling consistently strengthens RAG and can often serve as a substitute for increasing model size, though with diminishing returns at larger scales. Small- and mid-sized generators paired with larger corpora often rival much larger models with smaller corpora; mid-sized models tend to gain the most, while tiny and large models benefit less. Our analysis shows that improvements arise primarily from increased coverage of answer-bearing passages, while utilization efficiency remains largely unchanged. These findings establish a principled corpus-generator trade-off: investing in larger corpora offers an effective path to stronger RAG, often comparable to enlarging the LLM itself.
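The distinction the abstract draws between coverage and utilization can be made concrete with a small sketch. The definitions below are assumptions chosen for illustration, not the paper's exact metrics: coverage is taken as the fraction of questions for which at least one retrieved passage contains a gold answer string, and utilization as the fraction of those covered questions that the generator answers correctly. The `Example` record, the substring-matching rule, and the toy data are all hypothetical and are not results from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Example:
    """One evaluated question (hypothetical record layout, not from the paper)."""
    gold_answers: List[str]  # acceptable answer strings
    retrieved: List[str]     # passages returned by the retriever
    prediction: str          # text produced by the generator


def contains_answer(text: str, answers: List[str]) -> bool:
    """Case-insensitive substring match; a simplifying assumption."""
    lowered = text.lower()
    return any(answer.lower() in lowered for answer in answers)


def is_covered(ex: Example) -> bool:
    """True if at least one retrieved passage contains a gold answer."""
    return any(contains_answer(p, ex.gold_answers) for p in ex.retrieved)


def coverage(examples: List[Example]) -> float:
    """Fraction of questions with an answer-bearing retrieved passage."""
    return sum(is_covered(ex) for ex in examples) / len(examples)


def utilization(examples: List[Example]) -> float:
    """Among covered questions, the fraction answered correctly."""
    covered = [ex for ex in examples if is_covered(ex)]
    if not covered:
        return 0.0
    correct = sum(contains_answer(ex.prediction, ex.gold_answers) for ex in covered)
    return correct / len(covered)


def accuracy(examples: List[Example]) -> float:
    """Fraction of all questions answered correctly."""
    correct = sum(contains_answer(ex.prediction, ex.gold_answers) for ex in examples)
    return correct / len(examples)


# Toy data (invented for illustration): the same generator, two corpus sizes.
small_corpus_run = [
    Example(["paris"], ["Berlin is in Germany."], "Berlin"),
    Example(["4"], ["2 + 2 = 4"], "4"),
    Example(["blue"], ["The sky is blue."], "green"),
    Example(["mars"], ["Venus is hot."], "Venus"),
]
large_corpus_run = [
    Example(["paris"], ["Paris is the capital of France."], "Paris"),
    Example(["4"], ["2 + 2 = 4"], "4"),
    Example(["blue"], ["The sky is blue."], "green"),
    Example(["mars"], ["Mars is the red planet."], "Venus"),
]

if __name__ == "__main__":
    for name, run in [("small", small_corpus_run), ("large", large_corpus_run)]:
        print(name, coverage(run), utilization(run), accuracy(run))
```

In this toy setup, the larger corpus raises coverage and overall accuracy while the utilization figure stays the same, which is the qualitative pattern the abstract describes. Real evaluations would likely need a stricter answer-matching rule than plain substring matching.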
