Trust but Verify: Introducing DAVinCI -- A Framework for Dual Attribution and Verification in Claim Inference for Language Models

Abstract

Large Language Models (LLMs) have demonstrated remarkable fluency and versatility across a wide range of NLP tasks, yet they remain prone to factual inaccuracies and hallucinations. This limitation poses significant risks in high-stakes domains such as healthcare, law, and scientific communication, where trust and verifiability are paramount. In this paper, we introduce DAVinCI, a Dual Attribution and Verification framework designed to enhance the factual reliability and interpretability of LLM outputs. DAVinCI operates in two stages: (i) it attributes generated claims to internal model components and external sources; (ii) it verifies each claim using entailment-based reasoning and confidence calibration. We evaluate DAVinCI across multiple datasets, including FEVER and CLIMATE-FEVER, and compare its performance against standard verification-only baselines. Our results show that DAVinCI improves classification accuracy, attribution precision, recall, and F1-score by 5-20%. Through an extensive ablation study, we isolate the contributions of evidence span selection, recalibration thresholds, and retrieval quality. We also release a modular DAVinCI implementation that can be integrated into existing LLM pipelines. By bridging attribution and verification, DAVinCI offers a scalable path to auditable, trustworthy AI systems. This work contributes to the growing effort to make LLMs not only powerful but also accountable.
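To make the two-stage flow concrete, the following is a minimal sketch and not the released DAVinCI implementation. It assumes a tiny in-memory corpus, covers only the external-source half of attribution, and uses token overlap as a stand-in for an entailment model. The names `attribute`, `verify`, `support_score`, and `Verdict`, the label strings, and the 0.7 threshold are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    claim: str
    evidence: str  # passage the claim was attributed to
    score: float   # stand-in support score in [0, 1]
    label: str     # hypothetical labels: "SUPPORTED" or "NOT ENOUGH INFO"


def _tokens(text):
    # Lowercase, strip trailing punctuation, drop empty strings.
    return {w.strip(".,;:!?").lower() for w in text.split()} - {""}


def support_score(claim, passage):
    # Toy stand-in for an entailment model (an assumption of this sketch):
    # the fraction of claim tokens that also occur in the passage.
    claim_tokens = _tokens(claim)
    if not claim_tokens:
        return 0.0
    return len(claim_tokens & _tokens(passage)) / len(claim_tokens)


def attribute(claim, corpus):
    # Stage (i), external sources only: pick the best-matching passage.
    # Raises ValueError on an empty corpus.
    return max(corpus, key=lambda passage: support_score(claim, passage))


def verify(claim, evidence, threshold=0.7):
    # Stage (ii): score the claim against its evidence and apply a
    # decision threshold (0.7 is an arbitrary illustrative value).
    score = support_score(claim, evidence)
    label = "SUPPORTED" if score >= threshold else "NOT ENOUGH INFO"
    return Verdict(claim, evidence, score, label)


if __name__ == "__main__":
    corpus = [
        "Paris is the capital of France.",
        "The Nile is a river in Africa.",
    ]
    for claim in ("Paris is the capital of France",
                  "Berlin is the capital of Spain"):
        print(verify(claim, attribute(claim, corpus)))
```

Token overlap cannot detect contradiction, so this sketch never emits a refuted label. A real pipeline would replace `support_score` with a trained entailment model and calibrate the threshold on held-out data.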
