
Trust but Verify: Introducing DAVinCI -- A Framework for Dual Attribution and Verification in Claim Inference for Language Models

April 23, 2026
作者: Vipula Rawte, Ryan Rossi, Franck Dernoncourt, Nedim Lipka
cs.AI

Abstract

Large Language Models (LLMs) have demonstrated remarkable fluency and versatility across a wide range of NLP tasks, yet they remain prone to factual inaccuracies and hallucinations. This limitation poses significant risks in high-stakes domains such as healthcare, law, and scientific communication, where trust and verifiability are paramount. In this paper, we introduce DAVinCI - a Dual Attribution and Verification framework designed to enhance the factual reliability and interpretability of LLM outputs. DAVinCI operates in two stages: (i) it attributes generated claims to internal model components and external sources; (ii) it verifies each claim using entailment-based reasoning and confidence calibration. We evaluate DAVinCI across multiple datasets, including FEVER and CLIMATE-FEVER, and compare its performance against standard verification-only baselines. Our results show that DAVinCI significantly improves classification accuracy, attribution precision, recall, and F1-score by 5-20%. Through an extensive ablation study, we isolate the contributions of evidence span selection, recalibration thresholds, and retrieval quality. We also release a modular DAVinCI implementation that can be integrated into existing LLM pipelines. By bridging attribution and verification, DAVinCI offers a scalable path to auditable, trustworthy AI systems. This work contributes to the growing effort to make LLMs not only powerful but also accountable.
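The verification stage described above — entailment-based claim classification with confidence calibration and an abstention option — can be illustrated with a minimal sketch. The scoring function below is a keyword-overlap stand-in for a real NLI model, and the label set and threshold are illustrative assumptions, not the paper's actual implementation:

```python
def entailment_scores(claim: str, evidence: str) -> dict:
    """Stub entailment scorer: word overlap as a stand-in for the
    probabilities a real NLI model (e.g. one fine-tuned on FEVER) would give."""
    claim_words = set(claim.lower().split())
    evidence_words = set(evidence.lower().split())
    support = len(claim_words & evidence_words) / max(len(claim_words), 1)
    return {"SUPPORTS": support, "REFUTES": 1.0 - support}

def verify(claim: str, evidence: str, threshold: float = 0.5):
    """Return (label, confidence). Calibration step: if the top score
    falls below the threshold, abstain with NOT ENOUGH INFO."""
    scores = entailment_scores(claim, evidence)
    label, conf = max(scores.items(), key=lambda kv: kv[1])
    if conf < threshold:
        return "NOT ENOUGH INFO", conf
    return label, conf
```

Raising the threshold trades coverage for precision — a claim whose top entailment score is only marginally ahead is returned as NOT ENOUGH INFO rather than forced into SUPPORTS or REFUTES, which is the role the abstract's recalibration-threshold ablation probes.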