Document Attribution: Examining Citation Relationships using Large Language Models

May 9, 2025
Authors: Vipula Rawte, Ryan A. Rossi, Franck Dernoncourt, Nedim Lipka
cs.AI

Abstract

As Large Language Models (LLMs) are increasingly applied to document-based tasks such as document summarization, question answering, and information extraction, where user requirements focus on retrieving information from provided documents rather than relying on the model's parametric knowledge, ensuring the trustworthiness and interpretability of these systems has become a critical concern. A central approach to addressing this challenge is attribution, which traces generated outputs back to their source documents. However, since LLMs can produce inaccurate or imprecise responses, it is crucial to assess the reliability of these citations. To tackle this, our work proposes two techniques. (1) A zero-shot approach that frames attribution as a straightforward textual entailment task. Our method using flan-ul2 improves on the best baseline by 0.27% and 2.4% on the in-distribution (ID) and out-of-distribution (OOD) sets of AttributionBench, respectively. (2) We also explore the role of the attention mechanism in enhancing the attribution process. Using a smaller LLM, flan-t5-small, the F1 score outperforms the baseline at almost every layer, excluding layer 4 and layers 8 through 11.
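
The first technique can be illustrated with a minimal sketch that frames attribution as zero-shot textual entailment. The prompt template, the yes/no decision rule, and the substitution of flan-t5-small for the much larger flan-ul2 are assumptions made here so the example runs on modest hardware; they are not the paper's exact setup.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# flan-t5-small stands in for flan-ul2 (~20B params) so this runs on CPU;
# swap in "google/flan-ul2" to approximate the scale used in the paper.
model_name = "google/flan-t5-small"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

def is_attributable(document: str, claim: str) -> bool:
    """Attribution as textual entailment: does the cited document
    (premise) entail the generated claim (hypothesis)?"""
    # Illustrative prompt; the paper's exact template may differ.
    prompt = (
        f"Premise: {document}\n"
        f"Hypothesis: {claim}\n"
        "Does the premise entail the hypothesis? Answer yes or no."
    )
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    outputs = model.generate(**inputs, max_new_tokens=5)
    answer = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return answer.strip().lower().startswith("yes")

print(is_attributable(
    "The Eiffel Tower was completed in 1889 for the Paris World's Fair.",
    "The Eiffel Tower was finished in 1889.",
))  # expected: True
```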
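
The second technique can be sketched as an attention probe over flan-t5-small's layers. The aggregation below, averaging the encoder self-attention mass that claim tokens place on document tokens at a chosen layer, is one hypothetical reading of how attention could score attribution; the function name, the token-boundary split, and the layer choice are illustrative, not the authors' published procedure.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained(
    "google/flan-t5-small", output_attentions=True
)
model.eval()

def layer_attention_score(document: str, claim: str, layer: int) -> float:
    """Mean encoder self-attention that claim tokens place on document
    tokens at `layer`; a higher score suggests stronger grounding.
    The token split at the document/claim boundary is approximate."""
    enc = tokenizer(document + " " + claim, return_tensors="pt",
                    truncation=True)
    doc_len = len(tokenizer(document, add_special_tokens=False)["input_ids"])
    with torch.no_grad():
        out = model.encoder(**enc)
    # out.attentions: one (batch, heads, seq, seq) tensor per encoder layer
    attn = out.attentions[layer][0].mean(dim=0)  # average over heads
    # Attention flowing from claim positions (rows) to document positions.
    return attn[doc_len:, :doc_len].mean().item()

# flan-t5-small has 8 encoder layers, indexed 0-7.
score = layer_attention_score(
    "The Eiffel Tower was completed in 1889 for the Paris World's Fair.",
    "The tower was finished in 1889.",
    layer=5,
)
print(f"layer-5 grounding score: {score:.4f}")
```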