EIPE-text: Evaluation-Guided Iterative Plan Extraction for Long-Form Narrative Text Generation
October 12, 2023
Authors: Wang You, Wenshan Wu, Yaobo Liang, Shaoguang Mao, Chenfei Wu, Maosong Cao, Yuzhe Cai, Yiduo Guo, Yan Xia, Furu Wei, Nan Duan
cs.AI
Abstract
Plan-and-Write is a common hierarchical approach in long-form narrative text
generation, which first creates a plan to guide the narrative writing.
Following this approach, several studies rely on simply prompting large
language models for planning, which often yields suboptimal results. In this
paper, we propose a new framework called Evaluation-guided Iterative Plan
Extraction for long-form narrative text generation (EIPE-text), which extracts
plans from a corpus of narratives and uses the extracted plans to
construct a better planner. EIPE-text has three stages: plan extraction,
learning, and inference. In the plan extraction stage, it iteratively extracts
and improves plans from the narrative corpus and constructs a plan corpus. We
propose a question-answering (QA) based evaluation mechanism that automatically
evaluates the plans and generates detailed plan refinement instructions to guide
the iterative improvement. In the learning stage, we build a better planner by
fine-tuning on the plan corpus or in-context learning with examples from the
plan corpus. Finally, in the inference stage, we use a hierarchical approach to generate long-form
narratives. We evaluate the effectiveness of EIPE-text in the domains of novels
and storytelling. Both GPT-4-based evaluations and human evaluations
demonstrate that our method can generate more coherent and relevant long-form
narratives. Our code will be released in the future.
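
The plan extraction stage described above can be illustrated with a minimal sketch. This is not the authors' implementation (the abstract notes the code is not yet released); it only shows the general shape of an evaluation-guided refinement loop. All names here (`QAItem`, `extract_plan_iteratively`, and the `toy_*` helpers) are hypothetical, the string-matching helpers stand in for what would be language-model calls, and treating the list of failed questions as the refinement instruction is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class QAItem:
    """A question about the narrative and its reference answer (hypothetical type)."""
    question: str
    answer: str


Plan = List[str]


def extract_plan_iteratively(
    narrative: str,
    qa_items: List[QAItem],
    extract: Callable[[str], Plan],
    is_answerable: Callable[[Plan, QAItem], bool],
    revise: Callable[[str, Plan, List[QAItem]], Plan],
    max_iters: int = 3,
) -> Plan:
    """Extract a plan, then refine it until every question can be answered
    from the plan alone or the iteration budget runs out."""
    plan = extract(narrative)
    for _ in range(max_iters):
        # QA-based evaluation: which questions can the plan not yet answer?
        failed = [qa for qa in qa_items if not is_answerable(plan, qa)]
        if not failed:
            break
        # Assumption: the failed questions serve as the refinement instructions.
        plan = revise(narrative, plan, failed)
    return plan


# Toy stand-ins for the model calls (assumptions, for illustration only).
def toy_extract(narrative: str) -> Plan:
    # Deliberately incomplete first draft: keep only the first sentence.
    return [narrative.split(". ")[0].rstrip(".")]


def toy_is_answerable(plan: Plan, qa: QAItem) -> bool:
    # A question counts as answerable if its answer appears in some plan step.
    return any(qa.answer in step for step in plan)


def toy_revise(narrative: str, plan: Plan, failed: List[QAItem]) -> Plan:
    # Add the narrative sentences that contain the missing answers.
    sentences = [s.strip().rstrip(".") for s in narrative.split(". ")]
    additions = [
        s for s in sentences
        if s not in plan and any(qa.answer in s for qa in failed)
    ]
    return plan + additions


NARRATIVE = "Ann finds a map. Ann sails north. Ann discovers an island."
QA_ITEMS = [
    QAItem("Where does Ann sail?", "north"),
    QAItem("What does Ann discover?", "island"),
]

plan = extract_plan_iteratively(
    NARRATIVE, QA_ITEMS, toy_extract, toy_is_answerable, toy_revise
)
print(plan)
```

In this toy run the first draft covers only the opening sentence, both questions fail, and a single revision adds the two missing events. In a real system each of the three callables would presumably be a prompted model, and the resulting plans would be collected into the plan corpus used in the learning stage.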