A Systematic Survey of Text Summarization: From Statistical Methods to Large Language Models
June 17, 2024
Authors: Haopeng Zhang, Philip S. Yu, Jiawei Zhang
cs.AI
Abstract
Text summarization research has undergone several significant transformations
with the advent of deep neural networks, pre-trained language models (PLMs),
and recent large language models (LLMs). This survey thus provides a
comprehensive review of the research progress and evolution in text
summarization through the lens of these paradigm shifts. It is organized into
two main parts: (1) a detailed overview of datasets, evaluation metrics, and
summarization methods before the LLM era, encompassing traditional statistical
methods, deep learning approaches, and PLM fine-tuning techniques, and (2) the
first detailed examination of recent advancements in benchmarking, modeling,
and evaluating summarization in the LLM era. By synthesizing existing
literature and presenting a cohesive overview, this survey also discusses
research trends and open challenges, and proposes promising research directions
in summarization, aiming to guide researchers through the evolving landscape of
summarization research.