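A Systematic Survey of Text Summarization: From Statistical Methods to Large Language Models

Abstract

Text summarization research has undergone several significant transformations with the advent of deep neural networks, pre-trained language models (PLMs), and, more recently, large language models (LLMs). This survey provides a comprehensive review of the progress and evolution of text summarization research through the lens of these paradigm shifts. It is organized into two main parts: (1) a detailed overview of datasets, evaluation metrics, and summarization methods before the LLM era, covering traditional statistical methods, deep learning approaches, and PLM fine-tuning techniques; and (2) the first detailed examination of recent advances in benchmarking, modeling, and evaluating summarization in the LLM era. By synthesizing the existing literature into a cohesive overview, the survey also discusses research trends and open challenges and proposes promising research directions, aiming to guide researchers through the evolving landscape of summarization research.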
