ConfQA: Answer Only If You Are Confident

June 8, 2025
作者: Yin Huang, Yifan Ethan Xu, Kai Sun, Vera Yan, Alicia Sun, Haidar Khan, Jimmy Nguyen, Mohammad Kachuee, Zhaojiang Lin, Yue Liu, Aaron Colak, Anuj Kumar, Wen-tau Yih, Xin Luna Dong
cs.AI

Abstract

Can we teach Large Language Models (LLMs) to refrain from hallucinating factual statements? In this paper we present a fine-tuning strategy that we call ConfQA, which can reduce the hallucination rate from 20-40% to under 5% across multiple factuality benchmarks. The core idea is simple: when the LLM answers a question correctly, it is trained to continue with the answer; otherwise, it is trained to admit "I am unsure". But two key factors make the training highly effective. First, we introduce a dampening prompt, "answer only if you are confident", to explicitly guide the behavior; without it, hallucination remains as high as 15%-25%. Second, we leverage simple factual statements, specifically attribute values from knowledge graphs, to help LLMs calibrate their confidence, resulting in robust generalization across domains and question types. Building on this insight, we propose the Dual Neural Knowledge framework, which seamlessly selects between internally parameterized neural knowledge and externally recorded symbolic knowledge based on ConfQA's confidence. The framework enables potential accuracy gains beyond 95%, while reducing unnecessary external retrievals by more than 30%.
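
The two mechanisms sketched in the abstract, building fine-tuning targets from the base model's own correctness and routing between internal and external knowledge by confidence, can be illustrated with a minimal Python sketch. Everything below is an assumption for illustration only: the function names, the loose string-match grading, and the `model.generate` / `retriever.search` interfaces are placeholders, not the authors' implementation or API.

```python
# Minimal sketch (not the authors' code) of the two ideas in the abstract:
#   1) building ConfQA fine-tuning targets from the base model's own correctness,
#   2) Dual Neural Knowledge routing between internal and external knowledge.
# `model.generate(prompt) -> str` and `retriever.search(query) -> str` are
# assumed placeholder interfaces.

DAMPENING_PROMPT = "Answer only if you are confident."
IDK_RESPONSE = "I am unsure."


def is_correct(prediction: str, gold_answer: str) -> bool:
    """Loose string-match grading; the paper's evaluation is more careful."""
    return gold_answer.strip().lower() in prediction.lower()


def build_confqa_example(model, question: str, gold_answer: str) -> dict:
    """One fine-tuning example: keep the answer if the base model got it right,
    otherwise train the model to admit it is unsure."""
    prediction = model.generate(f"{DAMPENING_PROMPT}\n{question}")
    target = gold_answer if is_correct(prediction, gold_answer) else IDK_RESPONSE
    return {"input": f"{DAMPENING_PROMPT}\n{question}", "target": target}


def dual_neural_knowledge_answer(confqa_model, retriever, question: str) -> str:
    """Trust the internal (neural) answer when ConfQA is confident; fall back to
    external (symbolic) retrieval only when it says it is unsure."""
    internal = confqa_model.generate(f"{DAMPENING_PROMPT}\n{question}")
    if "unsure" in internal.lower():
        context = retriever.search(question)  # external symbolic knowledge
        return confqa_model.generate(f"Context: {context}\nQuestion: {question}")
    return internal  # internal parameterized knowledge, no retrieval needed
```

The routing function reflects the claimed benefit of the framework: retrieval is triggered only when the ConfQA-tuned model declines to answer, which is how unnecessary external lookups could be reduced while keeping accuracy high.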