CoEdIT: Text Editing by Task-Specific Instruction Tuning

Abstract

Text editing, or revision, is an essential function of the human writing process. Understanding the capabilities of LLMs for making high-quality revisions and collaborating with human writers is a critical step toward building effective writing assistants. Building on the prior success of LLMs and instruction tuning, we leverage instruction-tuned LLMs for text revision to improve the quality of user-generated text and to make the revision process more efficient. We introduce CoEdIT, a state-of-the-art text editing model for writing assistance. CoEdIT takes instructions from the user specifying the attributes of the desired text, such as "Make the sentence simpler" or "Write it in a more neutral style," and outputs the edited text. We present a large language model fine-tuned on a diverse collection of task-specific instructions for text editing (a total of 82K instructions). Our model (1) achieves state-of-the-art performance on various text editing benchmarks, (2) is competitive with the largest publicly available instruction-tuned LLMs while being ~60x smaller, (3) is capable of generalizing to unseen edit instructions, and (4) exhibits compositional comprehension abilities that let it generalize to instructions containing different combinations of edit actions. Through extensive qualitative and quantitative analysis, we show that writers prefer the edits suggested by CoEdIT over those of other state-of-the-art text editing models. Our code and dataset are publicly available.
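To make the instruction-driven interface concrete, here is a minimal sketch of how an edit instruction and a source text might be combined into a single model input, and how two edit actions might be merged into one composite instruction. The helper names `build_edit_prompt` and `compose_instructions`, the colon separator, and the "and" joiner are all assumptions made for illustration. The abstract does not specify the actual input format, and no model is called here.

```python
def build_edit_prompt(instruction: str, text: str) -> str:
    """Join an edit instruction and the source text into one input string.

    Assumption: the instruction is prepended to the text and separated
    by a colon. This format is hypothetical, not taken from the paper.
    """
    # Drop surrounding whitespace and any trailing period or colon.
    instruction = instruction.strip().rstrip(".:")
    return f"{instruction}: {text.strip()}"


def compose_instructions(*instructions: str) -> str:
    """Merge several edit actions into one composite instruction (hypothetical)."""
    cleaned = [i.strip().rstrip(".:") for i in instructions if i.strip()]
    if not cleaned:
        raise ValueError("at least one instruction is required")
    head, *rest = cleaned
    # Lowercase the first letter of later instructions so the result reads as one sentence.
    rest = [r[0].lower() + r[1:] for r in rest]
    return " and ".join([head, *rest])


if __name__ == "__main__":
    combined = compose_instructions(
        "Make the sentence simpler",
        "Write it in a more neutral style",
    )
    print(build_edit_prompt(combined, "The committee deliberated at length."))
```

The two instructions used above are the examples quoted in the abstract. The composite case only illustrates what "different combinations of edit actions" could look like as input, under the stated assumptions.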
