LUMINA: Detecting Hallucinations in RAG System with Context-Knowledge Signals

Abstract

Retrieval-Augmented Generation (RAG) aims to mitigate hallucinations in large language models (LLMs) by grounding responses in retrieved documents. Yet RAG-based LLMs still hallucinate even when provided with correct and sufficient context. A growing line of work suggests that this stems from an imbalance between how models use external context and how they use their internal knowledge, and several approaches have attempted to quantify these signals for hallucination detection. However, existing methods require extensive hyperparameter tuning, which limits their generalizability. We propose LUMINA, a novel framework that detects hallucinations in RAG systems through context-knowledge signals: external context utilization is quantified via distributional distance, while internal knowledge utilization is measured by tracking how predicted tokens evolve across transformer layers. We further introduce a framework for statistically validating these measurements. Experiments on common RAG hallucination benchmarks and four open-source LLMs show that LUMINA achieves consistently high AUROC and AUPRC scores, outperforming prior utilization-based methods by up to +13% AUROC on HalluRAG. Moreover, LUMINA remains robust under relaxed assumptions about retrieval quality and model matching, offering both effectiveness and practicality.
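
The two signals named in the abstract can be illustrated with a minimal sketch. The abstract does not say which distributional distance or which layer-tracking rule LUMINA uses, so the choices below are assumptions made only for illustration: Jensen-Shannon divergence between next-token distributions computed with and without the retrieved context, and the fraction of adjacent layers whose top-1 token differs. The function names (`context_score`, `knowledge_score`) and the toy logits are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def js_divergence(p, q):
    """Jensen-Shannon divergence in base 2, so the value lies in [0, 1]."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    mid = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, mid) + 0.5 * kl(q, mid)


def context_score(logits_with_ctx, logits_without_ctx):
    """Assumed proxy for external context utilization: how far the
    next-token distribution moves when retrieved documents are added."""
    return js_divergence(softmax(logits_with_ctx), softmax(logits_without_ctx))


def knowledge_score(per_layer_logits):
    """Assumed proxy for internal knowledge utilization: fraction of
    adjacent layer pairs whose top-1 (argmax) token differs."""
    if len(per_layer_logits) < 2:
        return 0.0
    top = [max(range(len(row)), key=row.__getitem__) for row in per_layer_logits]
    changes = sum(a != b for a, b in zip(top, top[1:]))
    return changes / (len(top) - 1)


# Toy example over a 3-token vocabulary (values are made up).
with_ctx = [2.0, 0.1, 0.1]
without_ctx = [0.1, 2.0, 0.1]
layers = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]

ctx = context_score(with_ctx, without_ctx)
know = knowledge_score(layers)
```

In a real setting the logits would come from the model's output distribution and from projecting intermediate hidden states onto the vocabulary; both steps are omitted here because the abstract gives no detail on them.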
