MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree Search

Abstract

We introduce MCTS-RAG, a novel approach that enhances the reasoning capabilities of small language models on knowledge-intensive tasks. It uses retrieval-augmented generation (RAG) to provide relevant context and Monte Carlo Tree Search (MCTS) to refine reasoning paths. MCTS-RAG dynamically integrates retrieval and reasoning through an iterative decision-making process. Standard RAG methods typically retrieve information independently of reasoning and therefore integrate knowledge suboptimally, while conventional MCTS reasoning depends solely on the model's internal knowledge without external facts. MCTS-RAG instead combines structured reasoning with adaptive retrieval. This integrated approach enhances decision-making, reduces hallucinations, and improves factual accuracy and response consistency. Experimental results on multiple reasoning and knowledge-intensive datasets (i.e., ComplexWebQA, GPQA, and FoolMeTwice) show that our method enables small-scale LMs to achieve performance comparable to frontier LLMs like GPT-4o by effectively scaling inference-time compute, setting a new standard for reasoning in small-scale models.
