Model Internals-based Answer Attribution for Trustworthy Retrieval-Augmented Generation
June 19, 2024
Authors: Jirui Qi, Gabriele Sarti, Raquel Fernández, Arianna Bisazza
cs.AI
Abstract
Ensuring the verifiability of model answers is a fundamental challenge for
retrieval-augmented generation (RAG) in the question answering (QA) domain.
Recently, self-citation prompting was proposed to make large language models
(LLMs) generate citations to supporting documents along with their answers.
However, self-citing LLMs often struggle to match the required format, refer to
non-existent sources, and fail to faithfully reflect the LLM's context usage
throughout the generation. In this work, we present MIRAGE -- Model
Internals-based RAG Explanations -- a plug-and-play approach using model
internals for faithful answer attribution in RAG applications. MIRAGE detects
context-sensitive answer tokens and pairs them with retrieved documents
contributing to their prediction via saliency methods. We evaluate our proposed
approach on a multilingual extractive QA dataset, finding high agreement with
human answer attribution. On open-ended QA, MIRAGE achieves citation quality
and efficiency comparable to self-citation while also allowing for a
finer-grained control of attribution parameters. Our qualitative evaluation
highlights the faithfulness of MIRAGE's attributions and underscores the
promising application of model internals for RAG answer attribution.
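To make the attribution recipe concrete, below is a minimal sketch of saliency-based answer attribution in the spirit of MIRAGE. This is not the authors' implementation (the released MIRAGE code builds on the Inseq interpretability toolkit and uses a contrastive context-sensitivity detection procedure); the gpt2 placeholder model, the prompt format, the gradient-norm saliency, and the 0.5 context-share threshold below are all illustrative assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder; the paper evaluates larger instruction-tuned LLMs
tok = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

docs = [
    "Paris is the capital and most populous city of France.",
    "Berlin is the capital of Germany.",
]
question = "What is the capital of France?"

# Build the prompt piece by piece so we know which token span each document occupies.
pieces = [f"[{i + 1}] {d}\n" for i, d in enumerate(docs)] + [f"Q: {question}\nA:"]
ids, spans, pos = [], [], 0
for piece in pieces:
    piece_ids = tok(piece, add_special_tokens=False).input_ids
    spans.append((pos, pos + len(piece_ids)))
    ids.extend(piece_ids)
    pos += len(piece_ids)
input_ids = torch.tensor([ids])
prompt_len = input_ids.shape[1]

# Greedy decoding for the answer tokens.
with torch.no_grad():
    full_ids = model.generate(input_ids, max_new_tokens=8, do_sample=False)

embed = model.get_input_embeddings()
for j in range(prompt_len, full_ids.shape[1]):
    # Gradient of the generated token's log-probability w.r.t. the input
    # embeddings ("gradient norm" saliency; an assumed stand-in for the
    # contrastive detection used in the paper).
    emb = embed(full_ids[:, :j]).detach().requires_grad_(True)
    logits = model(inputs_embeds=emb).logits
    target_logprob = torch.log_softmax(logits[0, -1], dim=-1)[full_ids[0, j]]
    (grad,) = torch.autograd.grad(target_logprob, emb)
    saliency = grad.norm(dim=-1).squeeze(0)  # one score per preceding token

    # Aggregate per-token saliency within each retrieved document's span.
    doc_scores = [saliency[a:b].sum().item() for a, b in spans[:-1]]
    context_share = sum(doc_scores) / saliency.sum().item()
    if context_share > 0.5:  # assumed threshold for "context-sensitive" tokens
        best_doc = max(range(len(docs)), key=lambda i: doc_scores[i])
        token_str = tok.decode(full_ids[0, j])
        print(f"answer token {token_str!r} attributed to document [{best_doc + 1}]")
```

The design mirrors the two-step structure described in the abstract: per-token saliency over the prompt flags which generated tokens actually depend on the retrieved context, and summing those scores within each document's span turns token-level importance into document-level citations.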