Teach LLMs to Personalize -- An Approach inspired by Writing Education
August 15, 2023
Authors: Cheng Li, Mingyang Zhang, Qiaozhu Mei, Yaqing Wang, Spurthi Amba Hombaiah, Yi Liang, Michael Bendersky
cs.AI
Abstract
Personalized text generation is an emerging research area that has attracted
much attention in recent years. Most studies in this direction focus on a
particular domain by designing bespoke features or models. In this work, we
propose a general approach for personalized text generation using large
language models (LLMs). Inspired by the practice of writing education, we
develop a multistage and multitask framework to teach LLMs for personalized
generation. In writing instruction, the task of writing from sources is often
decomposed into multiple steps that involve finding, evaluating, summarizing,
synthesizing, and integrating information. Analogously, our approach to
personalized text generation consists of multiple stages: retrieval, ranking,
summarization, synthesis, and generation. In addition, we introduce a multitask
setting that helps the model improve its generation ability further, which is
inspired by the observation in education that a student's reading proficiency
and writing ability are often correlated. We evaluate our approach on three
public datasets, each of which covers a different and representative domain.
Our results show significant improvements over a variety of baselines.
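To make the staged decomposition concrete, below is a minimal Python sketch of how the five stages (retrieval, ranking, summarization, synthesis, generation) might compose around a generic `llm` callable. Everything here is an illustrative assumption: the keyword-overlap retriever, the prompt wording, and the function names are placeholders rather than the paper's implementation, and the multitask training setup mentioned in the abstract (e.g., an auxiliary task of recognizing the user's own writing) is omitted.

```python
from dataclasses import dataclass


@dataclass
class Document:
    """A single entry from the user's past writings, with a relevance score."""
    text: str
    score: float = 0.0


def retrieve(query: str, user_history: list[str]) -> list[Document]:
    """Stage 1 -- retrieval: find past documents relevant to the writing task.
    Naive keyword overlap stands in for whatever retriever the paper uses."""
    query_terms = set(query.lower().split())
    docs = []
    for text in user_history:
        overlap = len(query_terms & set(text.lower().split()))
        if overlap:
            docs.append(Document(text=text, score=float(overlap)))
    return docs


def rank(docs: list[Document], top_k: int = 5) -> list[Document]:
    """Stage 2 -- ranking: order retrieved entries and keep the top-k."""
    return sorted(docs, key=lambda d: d.score, reverse=True)[:top_k]


def summarize(docs: list[Document], llm) -> str:
    """Stage 3 -- summarization: condense the ranked context."""
    joined = "\n".join(d.text for d in docs)
    return llm(f"Summarize the key points of these past writings:\n{joined}")


def synthesize(docs: list[Document], llm) -> str:
    """Stage 4 -- synthesis: distill recurring phrases and stylistic patterns."""
    joined = "\n".join(d.text for d in docs)
    return llm(f"List recurring phrases and stylistic patterns in:\n{joined}")


def generate(query: str, summary: str, patterns: str, llm) -> str:
    """Stage 5 -- generation: produce the personalized text, conditioned on
    every intermediate output."""
    prompt = (
        f"Task: {query}\n"
        f"Summary of the user's relevant past writing:\n{summary}\n"
        f"The user's stylistic patterns:\n{patterns}\n"
        f"Write the document in this user's voice:"
    )
    return llm(prompt)


def personalized_generation(query: str, user_history: list[str], llm) -> str:
    """Compose the five stages end to end."""
    docs = rank(retrieve(query, user_history))
    return generate(query, summarize(docs, llm), synthesize(docs, llm), llm)


if __name__ == "__main__":
    # Stand-in "LLM": echoes a truncated prompt so the pipeline runs end to end.
    dummy_llm = lambda p: p[:120] + "..."
    history = [
        "I love long runs on mountain trails at dawn.",
        "My honest review of this year's trail shoes.",
    ]
    print(personalized_generation("Write a blog post about trail running",
                                  history, dummy_llm))
```

The design point the sketch illustrates is that each stage emits text that becomes context for the next, mirroring how writing instruction decomposes writing from sources into finding, evaluating, summarizing, synthesizing, and integrating information; in the paper the final stage is a fine-tuned LLM rather than the plain prompting used here.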