PLANNER: Generating Diversified Paragraph via Latent Language Diffusion Model

Abstract

Autoregressive models for text sometimes generate repetitive, low-quality output because errors accumulate over the generation steps. This issue is often attributed to exposure bias: the difference between how a model is trained and how it is used during inference. Denoising diffusion models provide an alternative approach in which a model can revisit and revise its output. However, they can be computationally expensive, and prior efforts on text have led to models that produce less fluent output than autoregressive models, especially for longer text and paragraphs. In this paper, we propose PLANNER, a model that combines latent semantic diffusion with autoregressive generation to generate fluent text while exercising global control over paragraphs. The model achieves this by combining an autoregressive "decoding" module with a "planning" module that uses latent diffusion to generate semantic paragraph embeddings in a coarse-to-fine manner. The proposed method is evaluated on various conditional generation tasks, and results on semantic generation, text completion, and summarization show its effectiveness in generating high-quality long-form text efficiently.
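The two-stage design described in the abstract (a planning module that produces a paragraph-level latent, followed by an autoregressive decoder conditioned on it) can be illustrated with a toy sketch. This is not the authors' implementation: the interpolation-based refinement loop is only a stand-in for latent diffusion sampling, and every name here (`plan_latent`, `decode`, `toy_denoise`, `toy_score`, `TARGET`, `EMB`) is a hypothetical introduced for illustration.

```python
import random

# Hypothetical toy setup: a 3-dimensional "paragraph embedding" space
# and a 4-token vocabulary with hand-written token embeddings.
TARGET = [1.0, 0.5, -1.0]
VOCAB = ["plan", "then", "write", "<eos>"]
EMB = {
    "plan": [1.0, 0.0, 0.0],
    "then": [0.0, 1.0, 0.0],
    "write": [0.0, 0.0, 1.0],
    "<eos>": [0.0, 0.0, 0.0],
}


def plan_latent(denoise, dim, steps, rng):
    """Planning stage (toy): start from Gaussian noise and refine it
    step by step, trusting the denoiser's estimate more at each step."""
    z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for t in range(steps, 0, -1):
        z_hat = denoise(z, t)        # estimate of the clean latent
        keep = (t - 1) / steps       # fraction of the noisy state retained
        z = [keep * zi + (1.0 - keep) * hi for zi, hi in zip(z, z_hat)]
    return z


def decode(z, vocab, score, max_len, eos="<eos>"):
    """Decoding stage (toy): greedy autoregressive generation in which
    every step is conditioned on the fixed latent z and the prefix."""
    tokens = []
    for _ in range(max_len):
        nxt = max(vocab, key=lambda tok: score(z, tokens, tok))
        if nxt == eos:
            break
        tokens.append(nxt)
    return tokens


def toy_denoise(z, t):
    # Assumption: an idealized denoiser that always predicts TARGET.
    return list(TARGET)


def toy_score(z, prefix, tok):
    # Assumption: similarity to the latent, minus a repetition penalty.
    similarity = sum(a * b for a, b in zip(z, EMB[tok]))
    return similarity - 2.0 * prefix.count(tok)


if __name__ == "__main__":
    latent = plan_latent(toy_denoise, dim=3, steps=10, rng=random.Random(0))
    print(decode(latent, VOCAB, toy_score, max_len=8))
```

The point of the split is that the latent is fixed before any token is emitted, so the decoder's token-level choices cannot drift away from the paragraph-level plan; in the actual model both the denoiser and the decoder would be learned networks rather than the hand-written functions assumed here.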
