

LUMINA: Detecting Hallucinations in RAG System with Context-Knowledge Signals

September 26, 2025
Authors: Min-Hsuan Yeh, Yixuan Li, Tanwi Mallick
cs.AI

Abstract

Retrieval-Augmented Generation (RAG) aims to mitigate hallucinations in large language models (LLMs) by grounding responses in retrieved documents. Yet, RAG-based LLMs still hallucinate even when provided with correct and sufficient context. A growing line of work suggests that this stems from an imbalance between how models use external context and their internal knowledge, and several approaches have attempted to quantify these signals for hallucination detection. However, existing methods require extensive hyperparameter tuning, limiting their generalizability. We propose LUMINA, a novel framework that detects hallucinations in RAG systems through context-knowledge signals: external context utilization is quantified via distributional distance, while internal knowledge utilization is measured by tracking how predicted tokens evolve across transformer layers. We further introduce a framework for statistically validating these measurements. Experiments on common RAG hallucination benchmarks and four open-source LLMs show that LUMINA achieves consistently high AUROC and AUPRC scores, outperforming prior utilization-based methods by up to +13% AUROC on HalluRAG. Moreover, LUMINA remains robust under relaxed assumptions about retrieval quality and model matching, offering both effectiveness and practicality.
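As a rough illustration of the two signals named in the abstract, and not LUMINA's actual implementation, the sketch below shows one way such proxies could be computed with a HuggingFace-style causal LM: external context utilization as a distributional distance (here Jensen-Shannon, an assumed choice) between next-token predictions with and without the retrieved context, and internal knowledge utilization as a logit-lens-style trace of the top predicted token after each transformer layer. The model name, prompt templates, and the `model.model.norm` / `lm_head` access pattern (Llama-style) are all illustrative assumptions.

```python
# Minimal sketch (assumptions, not the paper's method): proxies for the
# context-knowledge signals described in the abstract.
import torch
from scipy.spatial.distance import jensenshannon
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "meta-llama/Llama-3.2-1B"  # assumed; any causal LM with a logit-lens-compatible head works
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, output_hidden_states=True)
model.eval()


@torch.no_grad()
def next_token_dist(prompt: str) -> torch.Tensor:
    """Next-token probability distribution at the last position."""
    ids = tok(prompt, return_tensors="pt").input_ids
    logits = model(ids).logits[0, -1]
    return torch.softmax(logits, dim=-1)


def context_utilization(question: str, context: str) -> float:
    """Proxy for external context utilization: distributional distance
    (Jensen-Shannon here) between predictions with vs. without context."""
    p_ctx = next_token_dist(f"Context: {context}\nQuestion: {question}\nAnswer:")
    p_no_ctx = next_token_dist(f"Question: {question}\nAnswer:")
    return float(jensenshannon(p_ctx.numpy(), p_no_ctx.numpy()))


@torch.no_grad()
def knowledge_trajectory(prompt: str) -> list[str]:
    """Proxy for internal knowledge utilization: logit-lens trace of the
    top predicted token after each transformer layer."""
    ids = tok(prompt, return_tensors="pt").input_ids
    out = model(ids)
    tokens = []
    for h in out.hidden_states[1:]:          # one entry per layer
        h_last = model.model.norm(h[0, -1])  # final norm (Llama-style; assumed)
        logits = model.lm_head(h_last)
        tokens.append(tok.decode([int(logits.argmax())]))
    return tokens
```

A large distance in `context_utilization` would suggest the retrieved context actually shifted the model's prediction, while `knowledge_trajectory` reveals how early and how stably the final answer emerges across layers; how LUMINA combines and statistically validates such measurements is detailed in the paper itself.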