QuCo-RAG: Quantifying Uncertainty from the Pre-training Corpus for Dynamic Retrieval-Augmented Generation

December 22, 2025
Authors: Dehai Min, Kailin Zhang, Tongtong Wu, Lu Cheng
cs.AI

Abstract

Dynamic Retrieval-Augmented Generation adaptively determines when to retrieve during generation to mitigate hallucinations in large language models (LLMs). However, existing methods rely on model-internal signals (e.g., logits, entropy), which are fundamentally unreliable because LLMs are typically ill-calibrated and often exhibit high confidence in erroneous outputs. We propose QuCo-RAG, which shifts from subjective confidence to objective statistics computed from pre-training data. Our method quantifies uncertainty through two stages: (1) before generation, we identify low-frequency entities indicating long-tail knowledge gaps; (2) during generation, we verify entity co-occurrence in the pre-training corpus, where zero co-occurrence often signals hallucination risk. Both stages leverage Infini-gram for millisecond-latency queries over 4 trillion tokens, triggering retrieval when uncertainty is high. Experiments on multi-hop QA benchmarks show QuCo-RAG achieves EM gains of 5--12 points over state-of-the-art baselines with OLMo-2 models, and transfers effectively to models with undisclosed pre-training data (Llama, Qwen, GPT), improving EM by up to 14 points. Domain generalization on biomedical QA further validates the robustness of our paradigm. These results establish corpus-grounded verification as a principled, practically model-agnostic paradigm for dynamic RAG. Our code is publicly available at https://github.com/ZhishanQ/QuCo-RAG.
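The two-stage decision described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: `corpus_count` stands in for a millisecond-latency Infini-gram count query over the ~4T-token pre-training corpus, and the frequency threshold and toy counts are assumptions for demonstration only.

```python
FREQ_THRESHOLD = 1000  # assumed cutoff: rarer entities are treated as long-tail

# Toy stand-in for corpus statistics; a real system would query Infini-gram.
_TOY_COUNTS = {
    "Marie Curie": 250_000,
    "Quirinus Obscurus": 12,                    # made-up long-tail entity
    ("Marie Curie", "radium"): 40_000,
    ("Marie Curie", "Quirinus Obscurus"): 0,    # never co-occur in corpus
}

def corpus_count(*terms):
    """Return the (co-)occurrence count of the given terms in the corpus."""
    key = terms[0] if len(terms) == 1 else tuple(terms)
    return _TOY_COUNTS.get(key, 0)

def should_retrieve_before_generation(question_entities):
    """Stage 1: retrieve if any question entity is low-frequency (long-tail)."""
    return any(corpus_count(e) < FREQ_THRESHOLD for e in question_entities)

def should_retrieve_during_generation(subject, generated_entity):
    """Stage 2: retrieve if the subject and a generated entity never co-occur,
    which the paper treats as a signal of hallucination risk."""
    return corpus_count(subject, generated_entity) == 0
```

For example, a question about "Quirinus Obscurus" would trigger retrieval up front (low frequency), and a generation that pairs "Marie Curie" with that entity would trigger retrieval again (zero co-occurrence), while the well-attested pair "Marie Curie" / "radium" would not.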