PEARL: Personalizing Large Language Model Writing Assistants with Generation-Calibrated Retrievers
November 15, 2023
Authors: Sheshera Mysore, Zhuoran Lu, Mengting Wan, Longqi Yang, Steve Menezes, Tina Baghaee, Emmanuel Barajas Gonzalez, Jennifer Neville, Tara Safavi
cs.AI
Abstract
Powerful large language models have facilitated the development of writing
assistants that promise to significantly improve the quality and efficiency of
composition and communication. However, a barrier to effective assistance is
the lack of personalization in LLM outputs to the author's communication style
and specialized knowledge. In this paper, we address this challenge by
proposing PEARL, a retrieval-augmented LLM writing assistant personalized with
a generation-calibrated retriever. Our retriever is trained to select historic
user-authored documents for prompt augmentation, such that they are likely to
best personalize LLM generations for a user request. We propose two key
novelties for training our retriever: 1) A training data selection method that
identifies user requests likely to benefit from personalization and documents
that provide that benefit; and 2) A scale-calibrating KL-divergence objective
that ensures that our retriever closely tracks the benefit of a document for
personalized generation. We demonstrate the effectiveness of PEARL in
generating personalized workplace social media posts and Reddit comments.
Finally, we showcase the potential of a generation-calibrated retriever to
double as a performance predictor and further improve low-quality generations
via LLM chaining.
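To make the idea of a generation-calibrated retriever more concrete, below is a minimal sketch of a listwise KL-divergence training loss in the spirit the abstract describes: the retriever's distribution over candidate user documents is pushed toward a target distribution derived from how much each document improves the personalized generation. The function name, the temperature parameter, and the use of a softmax over generation-quality scores are illustrative assumptions, not the paper's published formulation.

```python
# Illustrative sketch only: a listwise KL-divergence objective for training a
# retriever against per-document generation quality. Details are assumptions.
import torch
import torch.nn.functional as F


def generation_calibrated_kl_loss(retriever_scores: torch.Tensor,
                                  generation_quality: torch.Tensor,
                                  temperature: float = 1.0) -> torch.Tensor:
    """KL divergence between a target distribution built from generation
    quality and the retriever's distribution over candidate documents.

    retriever_scores:   (batch, k) retriever scores s(q, d_i) for k candidates.
    generation_quality: (batch, k) measured benefit of augmenting the prompt
                        with each candidate document d_i (e.g., an automatic
                        quality score of the resulting personalized output).
    """
    # Target: softmax over generation-quality scores; the temperature controls
    # how sharply the target reflects the scale of the measured benefit.
    target = F.softmax(generation_quality / temperature, dim=-1)
    log_pred = F.log_softmax(retriever_scores, dim=-1)
    # KL(target || retriever distribution), averaged over the batch.
    return F.kl_div(log_pred, target, reduction="batchmean")


if __name__ == "__main__":
    # Toy example: 2 requests, 8 candidate user-authored documents each.
    scores = torch.randn(2, 8, requires_grad=True)
    quality = torch.rand(2, 8)
    loss = generation_calibrated_kl_loss(scores, quality, temperature=0.5)
    loss.backward()
    print(loss.item())
```

Because the retriever is trained to track the benefit of each document for generation rather than only its relevance ranking, its scores can also serve as a rough performance predictor, which is how the abstract motivates routing low-scoring requests through an additional LLM chaining step.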