ConfQA: Answer Only If You Are Confident
June 8, 2025
作者: Yin Huang, Yifan Ethan Xu, Kai Sun, Vera Yan, Alicia Sun, Haidar Khan, Jimmy Nguyen, Mohammad Kachuee, Zhaojiang Lin, Yue Liu, Aaron Colak, Anuj Kumar, Wen-tau Yih, Xin Luna Dong
cs.AI
Abstract
Can we teach Large Language Models (LLMs) to refrain from hallucinating
factual statements? In this paper we present a fine-tuning strategy that we
call ConfQA, which can reduce the hallucination rate from 20-40% to under 5% across
multiple factuality benchmarks. The core idea is simple: when the LLM answers a
question correctly, it is trained to continue with the answer; otherwise, it is
trained to admit "I am unsure". But there are two key factors that make the
training highly effective. First, we introduce a dampening prompt "answer only
if you are confident" to explicitly guide the behavior, without which the
hallucination rate remains as high as 15%-25%. Second, we leverage simple factual
statements, specifically attribute values from knowledge graphs, to help LLMs
calibrate their confidence, resulting in robust generalization across domains and
question types. Building on this insight, we propose the Dual Neural Knowledge
framework, which seamlessly selects between internally parameterized neural
knowledge and externally recorded symbolic knowledge based on ConfQA's
confidence. The framework can potentially push accuracy beyond 95%, while
reducing unnecessary external retrievals by over 30%.
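To make the two ideas above concrete, here is a minimal Python sketch of (a) building ConfQA-style fine-tuning targets from simple factual questions (e.g. knowledge-graph attribute values) and (b) routing between internal neural knowledge and external symbolic knowledge based on whether the fine-tuned model abstains. The `generate()` and `lookup()` interfaces, helper names, and exact prompt wording are illustrative assumptions, not the paper's implementation.

```python
from typing import Protocol


class TextModel(Protocol):
    """Assumed minimal interface for any text-generation model used below."""
    def generate(self, prompt: str) -> str: ...


# The abstract's dampening prompt, paraphrased into a full instruction (wording assumed).
DAMPENER = "Answer the following question only if you are confident; otherwise say 'I am unsure'."


def normalize(text: str) -> str:
    """Crude normalization so a model answer can be matched against the gold value."""
    return text.strip().lower().rstrip(".")


def build_confqa_example(question: str, gold_answer: str, base_model: TextModel) -> dict:
    """Build one ConfQA fine-tuning example from a simple factual question.

    If the base model already answers correctly, the training target keeps that answer;
    otherwise the target is the admission "I am unsure".
    """
    prompt = f"{DAMPENER}\nQuestion: {question}"
    model_answer = base_model.generate(prompt)
    is_correct = normalize(gold_answer) in normalize(model_answer)
    target = model_answer if is_correct else "I am unsure"
    return {"prompt": prompt, "target": target}


def dual_neural_knowledge_answer(question: str, confqa_model: TextModel, retriever) -> str:
    """Route between neural and symbolic knowledge: trust the fine-tuned model's direct
    answer unless it abstains, and only then fall back to external retrieval."""
    prompt = f"{DAMPENER}\nQuestion: {question}"
    answer = confqa_model.generate(prompt)
    if "i am unsure" in answer.lower():
        evidence = retriever.lookup(question)  # assumed retriever interface (e.g. KG or search)
        return confqa_model.generate(f"Using this evidence: {evidence}\nQuestion: {question}")
    return answer  # confident answer: skip the external retrieval entirely
```

In this sketch, the retrieval fallback is only triggered on abstention, which is how the framework could cut unnecessary external retrievals while keeping accuracy high.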