Augmenting LLM Reasoning with Dynamic Notes Writing for Complex QA

Abstract

Iterative RAG for multi-hop question answering struggles with lengthy contexts and the accumulation of irrelevant information. This hinders a model's ability to process and reason over retrieved content and limits performance. Recent methods focus on compressing retrieved information, but they are restricted to single-round RAG, require fine-tuning, or do not scale to iterative RAG. To address these challenges, we propose Notes Writing, a method that generates concise and relevant notes from the retrieved documents at each step, thereby reducing noise and retaining only essential information. This indirectly increases the effective context length of Large Language Models (LLMs), enabling them to reason and plan more effectively while processing larger volumes of input text. Notes Writing is framework-agnostic and can be integrated with different iterative RAG methods. We demonstrate its effectiveness with three iterative RAG methods, across two models and four evaluation datasets. Notes Writing yields an average improvement of 15.6 percentage points overall, with a minimal increase in output tokens.
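The per-step note generation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify prompts, retrievers, or how notes are passed to later steps, so every name below (`iterative_rag_with_notes`, `toy_retrieve`, `toy_write_note`, `toy_next_query`, `DOCS`, `PLAN`) is hypothetical. The toy functions are simple string operations standing in for a retriever and for LLM calls.

```python
from typing import Callable, List, Optional


def iterative_rag_with_notes(
    question: str,
    retrieve: Callable[[str], List[str]],
    write_note: Callable[[str, List[str]], str],
    next_query: Callable[[str, List[str]], Optional[str]],
    max_steps: int = 3,
) -> List[str]:
    """Run retrieval steps, keeping a short note per step instead of raw documents."""
    notes: List[str] = []
    for _ in range(max_steps):
        # Plan the next sub-query from the question and the notes so far.
        query = next_query(question, notes)
        if query is None:
            break
        docs = retrieve(query)
        # Condense the retrieved documents; only the note is carried forward.
        note = write_note(query, docs)
        if note:
            notes.append(note)
    return notes


# --- Toy stand-ins (assumptions for illustration only) ---
DOCS = [
    "Film X was directed by Jane Doe. The premiere had long queues.",
    "Jane Doe was born in Oslo. She collects stamps.",
    "Popcorn was first sold in cinemas in the 1930s.",
]
PLAN = ["directed by", "born in"]  # fixed sub-queries replacing an LLM planner


def toy_retrieve(query: str) -> List[str]:
    # Return every document that contains the query phrase.
    return [d for d in DOCS if query in d]


def toy_write_note(query: str, docs: List[str]) -> str:
    # Keep only sentences mentioning the query phrase; an LLM would do this in practice.
    kept = []
    for doc in docs:
        for sentence in doc.split(". "):
            if query in sentence:
                kept.append(sentence.rstrip(".") + ".")
    return " ".join(kept)


def toy_next_query(question: str, notes: List[str]) -> Optional[str]:
    # Issue one planned sub-query per note collected so far, then stop.
    return PLAN[len(notes)] if len(notes) < len(PLAN) else None


notes = iterative_rag_with_notes(
    "Where was the director of Film X born?",
    toy_retrieve,
    toy_write_note,
    toy_next_query,
)
print(notes)
```

In this sketch the context carried between steps grows by one short note per step rather than by the full retrieved documents, which is the noise-reduction effect the abstract describes.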
