SurveySum: A Dataset for Summarizing Multiple Scientific Articles into a Survey Section
August 29, 2024
Authors: Leandro Carísio Fernandes, Gustavo Bartz Guedes, Thiago Soares Laitz, Thales Sales Almeida, Rodrigo Nogueira, Roberto Lotufo, Jayr Pereira
cs.AI
Abstract
Document summarization is the task of condensing texts into concise and
informative summaries. This paper introduces a novel dataset designed for
summarizing multiple scientific articles into a section of a survey. Our
contributions are: (1) SurveySum, a new dataset addressing the gap in
domain-specific summarization tools; (2) two specific pipelines to summarize
scientific articles into a section of a survey; and (3) the evaluation of these
pipelines using multiple metrics to compare their performance. Our results
highlight the importance of high-quality retrieval stages and the impact of
different configurations on the quality of generated summaries.