ConfQA: Answer Only If You Are Confident
June 8, 2025
作者: Yin Huang, Yifan Ethan Xu, Kai Sun, Vera Yan, Alicia Sun, Haidar Khan, Jimmy Nguyen, Mohammad Kachuee, Zhaojiang Lin, Yue Liu, Aaron Colak, Anuj Kumar, Wen-tau Yih, Xin Luna Dong
cs.AI
Abstract
Can we teach Large Language Models (LLMs) to refrain from hallucinating
factual statements? In this paper we present a fine-tuning strategy that we
call ConfQA, which can reduce the hallucination rate from 20-40% to under 5% across
multiple factuality benchmarks. The core idea is simple: when the LLM answers a
question correctly, it is trained to continue with the answer; otherwise, it is
trained to admit "I am unsure". But there are two key factors that make the
training highly effective. First, we introduce a dampening prompt "answer only
if you are confident" to explicitly guide the behavior, without which the
hallucination rate remains as high as 15%-25%. Second, we leverage simple factual
statements, specifically attribute values from knowledge graphs, to help LLMs
calibrate their confidence, resulting in robust generalization across domains and
question types. Building on this insight, we propose the Dual Neural Knowledge
framework, which seamlessly selects between internally parameterized neural
knowledge and externally recorded symbolic knowledge based on ConfQA's
confidence. The framework can potentially push accuracy beyond 95%, while
reducing unnecessary external retrievals by over 30%.
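To make the two ideas above concrete, here is a minimal Python sketch of (a) building ConfQA-style fine-tuning targets from simple factual questions (e.g. knowledge-graph attribute values) and (b) routing between internal neural knowledge and external symbolic knowledge based on whether the fine-tuned model abstains. The `generate()` and `lookup()` interfaces, helper names, and exact prompt wording are illustrative assumptions, not the paper's implementation.

```python
from typing import Protocol


class TextModel(Protocol):
    """Assumed minimal interface for any text-generation model used below."""
    def generate(self, prompt: str) -> str: ...


# The abstract's dampening prompt, paraphrased into a full instruction (wording assumed).
DAMPENER = "Answer the following question only if you are confident; otherwise say 'I am unsure'."


def normalize(text: str) -> str:
    """Crude normalization so a model answer can be matched against the gold value."""
    return text.strip().lower().rstrip(".")


def build_confqa_example(question: str, gold_answer: str, base_model: TextModel) -> dict:
    """Build one ConfQA fine-tuning example from a simple factual question.

    If the base model already answers correctly, the training target keeps that answer;
    otherwise the target is the admission "I am unsure".
    """
    prompt = f"{DAMPENER}\nQuestion: {question}"
    model_answer = base_model.generate(prompt)
    is_correct = normalize(gold_answer) in normalize(model_answer)
    target = model_answer if is_correct else "I am unsure"
    return {"prompt": prompt, "target": target}


def dual_neural_knowledge_answer(question: str, confqa_model: TextModel, retriever) -> str:
    """Route between neural and symbolic knowledge: trust the fine-tuned model's direct
    answer unless it abstains, and only then fall back to external retrieval."""
    prompt = f"{DAMPENER}\nQuestion: {question}"
    answer = confqa_model.generate(prompt)
    if "i am unsure" in answer.lower():
        evidence = retriever.lookup(question)  # assumed retriever interface (e.g. KG or search)
        return confqa_model.generate(f"Using this evidence: {evidence}\nQuestion: {question}")
    return answer  # confident answer: skip the external retrieval entirely
```

In this sketch, the retrieval fallback is only triggered on abstention, which is how the framework could cut unnecessary external retrievals while keeping accuracy high.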