

A Systematic Survey of Text Summarization: From Statistical Methods to Large Language Models

June 17, 2024
Authors: Haopeng Zhang, Philip S. Yu, Jiawei Zhang
cs.AI

Abstract

Text summarization research has undergone several significant transformations with the advent of deep neural networks, pre-trained language models (PLMs), and recent large language models (LLMs). This survey thus provides a comprehensive review of the research progress and evolution in text summarization through the lens of these paradigm shifts. It is organized into two main parts: (1) a detailed overview of datasets, evaluation metrics, and summarization methods before the LLM era, encompassing traditional statistical methods, deep learning approaches, and PLM fine-tuning techniques, and (2) the first detailed examination of recent advancements in benchmarking, modeling, and evaluating summarization in the LLM era. By synthesizing existing literature and presenting a cohesive overview, this survey also discusses research trends and open challenges, and proposes promising research directions in summarization, aiming to guide researchers through the evolving landscape of summarization research.