Do RAG Systems Suffer From Positional Bias?

Abstract

Retrieval-Augmented Generation (RAG) improves LLM accuracy by adding passages retrieved from an external corpus to the LLM prompt. This paper investigates how positional bias, the tendency of LLMs to weight information differently depending on its position in the prompt, affects not only the LLM's ability to capitalize on relevant passages but also its susceptibility to distracting passages. Through extensive experiments on three benchmarks, we show that state-of-the-art retrieval pipelines, while attempting to retrieve relevant passages, systematically bring highly distracting ones to the top ranks: for over 60% of queries, the top-10 retrieved passages contain at least one highly distracting passage. As a result, the impact of LLM positional bias, which related work often reports as very prominent in controlled settings, is actually marginal in real scenarios, because relevant and distracting passages are both penalized in turn. Indeed, our findings reveal that sophisticated strategies that rearrange passages according to LLM positional preferences perform no better than random shuffling.

