SurveySum: A Dataset for Summarizing Multiple Scientific Articles into a Survey Section
August 29, 2024
Authors: Leandro Carísio Fernandes, Gustavo Bartz Guedes, Thiago Soares Laitz, Thales Sales Almeida, Rodrigo Nogueira, Roberto Lotufo, Jayr Pereira
cs.AI
Abstract
Document summarization is the task of shortening texts into concise and
informative summaries. This paper introduces a novel dataset designed for
summarizing multiple scientific articles into a section of a survey. Our
contributions are: (1) SurveySum, a new dataset addressing the gap in
domain-specific summarization tools; (2) two specific pipelines to summarize
scientific articles into a section of a survey; and (3) the evaluation of these
pipelines using multiple metrics to compare their performance. Our results
highlight the importance of high-quality retrieval stages and the impact of
different configurations on the quality of generated summaries.