Weaver: Foundation Models for Creative Writing
January 30, 2024
Authors: Tiannan Wang, Jiamin Chen, Qingrui Jia, Shuai Wang, Ruoyu Fang, Huilin Wang, Zhaowei Gao, Chunzhao Xie, Chuou Xu, Jihong Dai, Yibin Liu, Jialong Wu, Shengwei Ding, Long Li, Zhiwei Huang, Xinle Deng, Teng Yu, Gangan Ma, Han Xiao, Zixin Chen, Danjun Xiang, Yunxia Wang, Yuanyuan Zhu, Yi Xiao, Jing Wang, Yiru Wang, Siran Ding, Jiayang Huang, Jiayi Xu, Yilihamu Tayier, Zhenyu Hu, Yuan Gao, Chengfeng Zheng, Yueshu Ye, Yihang Li, Lei Wan, Xinyue Jiang, Yujie Wang, Siyu Cheng, Zhule Song, Xiangru Tang, Xiaohua Xu, Ningyu Zhang, Huajun Chen, Yuchen Eleanor Jiang, Wangchunshu Zhou
cs.AI
Abstract
This work introduces Weaver, our first family of large language models (LLMs)
dedicated to content creation. Weaver is pre-trained on a carefully selected
corpus that focuses on improving the writing capabilities of large language
models. We then fine-tune Weaver for creative and professional writing purposes
and align it to the preferences of professional writers using a suite of novel
methods for instruction data synthesis and LLM alignment, making it able to
produce more human-like texts and follow more diverse instructions for content
creation. The Weaver family comprises four model sizes, Weaver Mini (1.8B), Weaver
Base (6B), Weaver Pro (14B), and Weaver Ultra (34B), which suit
different applications and can be dynamically dispatched by a routing agent
according to query complexity to balance response quality and computation cost.
Evaluation on a carefully curated benchmark for assessing the writing
capabilities of LLMs shows that Weaver models of all sizes outperform generalist
LLMs several times their size. Notably, our most capable Weaver Ultra
model surpasses GPT-4, a state-of-the-art generalist LLM, on various writing
scenarios, demonstrating the advantage of training specialized LLMs for writing
purposes. Moreover, Weaver natively supports retrieval-augmented generation
(RAG) and function calling (tool usage). We present various use cases of these
abilities for improving AI-assisted writing systems, including integration of
external knowledge bases, tools, or APIs, and providing personalized writing
assistance. Furthermore, we discuss and summarize guidelines and best
practices for pre-training and fine-tuning domain-specific LLMs.