Unstructured Evidence Attribution for Long Context Query Focused Summarization

February 20, 2025
Authors: Dustin Wright, Zain Muhammad Mujahid, Lu Wang, Isabelle Augenstein, David Jurgens
cs.AI

Abstract

Large language models (LLMs) are capable of generating coherent summaries from very long contexts given a user query. Extracting and properly citing evidence spans could help improve the transparency and reliability of these summaries. At the same time, LLMs suffer from positional biases in terms of which information they understand and attend to, which could affect evidence citation. Whereas previous work has focused on evidence citation with predefined levels of granularity (e.g. sentence, paragraph, document, etc.), we propose the task of long-context query focused summarization with unstructured evidence citation. We show how existing systems struggle to generate and properly cite unstructured evidence from their context, and that evidence tends to be "lost-in-the-middle". To help mitigate this, we create the Summaries with Unstructured Evidence Text dataset (SUnsET), a synthetic dataset generated using a novel domain-agnostic pipeline which can be used as supervision to adapt LLMs to this task. We demonstrate across 5 LLMs of different sizes and 4 datasets with varying document types and lengths that LLMs adapted with SUnsET data generate more relevant and factually consistent evidence than their base models, extract evidence from more diverse locations in their context, and can generate more relevant and consistent summaries.
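To make the task concrete, here is a minimal illustrative sketch (not code from the paper) of the two ideas in the abstract: prompting an LLM for a query-focused summary with unstructured, free-length evidence quotes, and then locating each cited span in the source context to check for the "lost-in-the-middle" positional bias. The prompt wording and the helper names `build_prompt` and `evidence_positions` are assumptions for illustration, not the authors' pipeline.

```python
# Illustrative sketch only; prompt format and helper names are assumptions,
# not the SUnsET pipeline itself.
from typing import Dict, List


def build_prompt(query: str, context: str) -> str:
    """Ask for a summary plus verbatim evidence quotes of arbitrary length
    (unstructured spans, rather than fixed sentences or paragraphs)."""
    return (
        f"Context:\n{context}\n\n"
        f"Query: {query}\n\n"
        "Write a summary that answers the query. Then list the verbatim "
        "passages from the context (of any length) that support each claim, "
        "one per line, prefixed with 'EVIDENCE:'."
    )


def evidence_positions(context: str, evidence_spans: List[str]) -> List[Dict]:
    """Map each cited span to its relative position in the context (0.0-1.0),
    bucketed into beginning/middle/end to surface positional bias."""
    results = []
    for span in evidence_spans:
        start = context.find(span)
        if start == -1:
            # Span is not a verbatim quote from the context (unsupported citation).
            results.append({"span": span, "found": False, "position": None})
            continue
        rel = start / max(len(context) - 1, 1)
        bucket = "beginning" if rel < 1 / 3 else "middle" if rel < 2 / 3 else "end"
        results.append(
            {"span": span, "found": True, "position": round(rel, 2), "bucket": bucket}
        )
    return results


if __name__ == "__main__":
    # Toy context with one relevant sentence buried in the middle.
    context = "A " * 500 + "The key finding is X. " + "B " * 500
    spans = ["The key finding is X.", "This quote is not in the context."]
    for record in evidence_positions(context, spans):
        print(record)
```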
