A Systematic Survey of Text Summarization: From Statistical Methods to Large Language Models
June 17, 2024
Authors: Haopeng Zhang, Philip S. Yu, Jiawei Zhang
cs.AI
Abstract
Text summarization research has undergone several significant transformations
with the advent of deep neural networks, pre-trained language models (PLMs),
and recent large language models (LLMs). This survey thus provides a
comprehensive review of the research progress and evolution in text
summarization through the lens of these paradigm shifts. It is organized into
two main parts: (1) a detailed overview of datasets, evaluation metrics, and
summarization methods before the LLM era, encompassing traditional statistical
methods, deep learning approaches, and PLM fine-tuning techniques, and (2) the
first detailed examination of recent advancements in benchmarking, modeling,
and evaluating summarization in the LLM era. By synthesizing existing
literature and presenting a cohesive overview, this survey also discusses
research trends, open challenges, and proposes promising research directions in
summarization, aiming to guide researchers through the evolving landscape of
summarization research.