Trust but Verify: Introducing DAVinCI -- A Framework for Dual Attribution and Verification in Claim Inference for Language Models
April 23, 2026
Authors: Vipula Rawte, Ryan Rossi, Franck Dernoncourt, Nedim Lipka
cs.AI
Abstract
Large Language Models (LLMs) have demonstrated remarkable fluency and versatility across a wide range of NLP tasks, yet they remain prone to factual inaccuracies and hallucinations. This limitation poses significant risks in high-stakes domains such as healthcare, law, and scientific communication, where trust and verifiability are paramount. In this paper, we introduce DAVinCI, a framework for Dual Attribution and Verification in Claim Inference designed to enhance the factual reliability and interpretability of LLM outputs. DAVinCI operates in two stages: (i) it attributes generated claims to internal model components and external knowledge sources; and (ii) it verifies each claim using entailment-based reasoning and confidence calibration. We evaluate DAVinCI on multiple datasets, including FEVER and CLIMATE-FEVER, and compare its performance against standard verification-only baselines. Our results show that DAVinCI significantly improves classification accuracy, attribution precision, recall, and F1-score by 5-20%. Through an extensive ablation study, we isolate the contributions of evidence span selection, recalibration thresholds, and retrieval quality. We also release a modular DAVinCI implementation that can be integrated into existing LLM pipelines. By bridging attribution and verification, DAVinCI offers a scalable path to auditable, trustworthy AI systems. This work contributes to the growing effort to make LLMs not only powerful but also accountable.
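To make stage (ii) concrete, the sketch below shows one plausible form of entailment-based claim verification with a confidence threshold: an off-the-shelf NLI model scores a (claim, evidence) pair, and low-confidence verdicts are abstained on. The model choice (roberta-large-mnli), the threshold value, the label set, and the verify_claim helper are illustrative assumptions, not the released DAVinCI implementation.

```python
# Minimal sketch of entailment-based verification with confidence calibration.
# Hypothetical: model, threshold, and labels are illustrative choices, not
# the paper's exact setup.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "roberta-large-mnli"  # assumption: any NLI model could be swapped in
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()

def verify_claim(evidence: str, claim: str, threshold: float = 0.7) -> str:
    """Label a claim against evidence as SUPPORTED / REFUTED / NOT ENOUGH INFO."""
    # NLI convention: premise (evidence) first, hypothesis (claim) second.
    inputs = tokenizer(evidence, claim, return_tensors="pt", truncation=True)
    with torch.no_grad():
        probs = torch.softmax(model(**inputs).logits, dim=-1).squeeze(0)
    # roberta-large-mnli label order: 0=contradiction, 1=neutral, 2=entailment
    contradiction, _neutral, entailment = probs.tolist()
    if entailment >= threshold:
        return "SUPPORTED"
    if contradiction >= threshold:
        return "REFUTED"
    # Calibration step: verdicts below the confidence threshold are abstained on.
    return "NOT ENOUGH INFO"

print(verify_claim(
    "The Eiffel Tower is located in Paris, France.",
    "The Eiffel Tower is in Paris.",
))
```

In a full pipeline along the lines the abstract describes, each such per-claim verdict would be paired with the attribution stage's provenance record (the internal components and external sources a claim is traced to), so that every accepted claim carries both a verdict and an auditable evidence trail.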