ConfQA: Answer Only If You Are Confident

Abstract

Can we teach Large Language Models (LLMs) to refrain from hallucinating factual statements? In this paper we present a fine-tuning strategy that we call ConfQA, which can reduce the hallucination rate from 20-40% to under 5% across multiple factuality benchmarks. The core idea is simple: when the LLM answers a question correctly, it is trained to continue with the answer; otherwise, it is trained to admit "I am unsure". Two key factors make this training highly effective. First, we introduce a dampening prompt, "answer only if you are confident", to explicitly guide the behavior; without it, the hallucination rate remains as high as 15%-25%. Second, we leverage simple factual statements, specifically attribute values from knowledge graphs, to help LLMs calibrate their confidence, resulting in robust generalization across domains and question types. Building on this insight, we propose the Dual Neural Knowledge framework, which seamlessly selects between internally parameterized neural knowledge and externally recorded symbolic knowledge based on ConfQA's confidence. The framework enables potential accuracy gains to beyond 95%, while reducing unnecessary external retrievals by over 30%.
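The training recipe and the confidence-based routing described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `QAPair`, `build_training_example`, `exact_match`, and `answer_with_fallback` are hypothetical, exact string matching stands in for whatever correctness check the paper actually uses, and the choice to keep the model's own correct answer as the training target is an assumption. Only the dampening prompt and the "I am unsure" response are taken from the abstract.

```python
from dataclasses import dataclass
from typing import Callable

# Taken from the abstract: the dampening prompt and the abstention phrase.
DAMPENER = "Answer only if you are confident."
UNSURE = "I am unsure"


@dataclass
class QAPair:
    """A question with its reference answer (e.g. a knowledge-graph attribute value)."""
    question: str
    gold_answer: str


def exact_match(prediction: str, gold: str) -> bool:
    """Assumed stand-in for the correctness check: case-insensitive exact match."""
    return prediction.strip().lower() == gold.strip().lower()


def build_training_example(
    pair: QAPair,
    model_answer: str,
    is_correct: Callable[[str, str], bool] = exact_match,
) -> dict:
    """Build one fine-tuning example.

    If the base model's answer was correct, the target keeps that answer;
    otherwise the target is the abstention phrase.
    """
    prompt = f"{pair.question} {DAMPENER}"
    target = model_answer if is_correct(model_answer, pair.gold_answer) else UNSURE
    return {"prompt": prompt, "target": target}


def answer_with_fallback(
    question: str,
    confqa_model: Callable[[str], str],
    retrieve_and_answer: Callable[[str], str],
) -> str:
    """Route between internal (neural) and external (symbolic) knowledge.

    The fine-tuned model is asked first; external retrieval is triggered
    only when the model abstains.
    """
    reply = confqa_model(f"{question} {DAMPENER}")
    if reply.strip().lower().startswith(UNSURE.lower()):
        return retrieve_and_answer(question)
    return reply
```

In this sketch, `confqa_model` and `retrieve_and_answer` are plain callables, so any fine-tuned model and any retrieval pipeline could be plugged in; retrieval is skipped whenever the model answers confidently, which is the mechanism by which unnecessary external lookups would be avoided.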
