COIG-Writer: A High-Quality Dataset for Chinese Creative Writing with Thought Processes

October 16, 2025
Authors: Yunwen Li, Shuangshuang Ying, Xingwei Qu, Xin Li, Sheng Jin, Minghao Liu, Zhoufutu Wen, Tianyu Zheng, Xeron Du, Qiguang Chen, Jiajun Shi, Wangchunshu Zhou, Jiazhan Feng, Wanjun Zhong, Libo Qin, Stephen Huang, Wanxiang Che, Chenghua Lin, Eli Zhang
cs.AI

Abstract

Large language models exhibit systematic deficiencies in creative writing, particularly in non-English contexts where training data is scarce and lacks process-level supervision. We present COIG-Writer, a novel Chinese creative writing dataset that captures both diverse outputs and their underlying thought processes through systematic reverse-engineering of high-quality texts. Unlike existing datasets that provide only input-output pairs, COIG-Writer comprises 1,665 meticulously curated triplets spanning 51 genres, each containing: (1) a reverse-engineered prompt, (2) detailed creative reasoning documenting decision-making processes, and (3) the final text. Through comprehensive experiments, we identify a two-component model of creative writing: narrative logic (provided by process supervision) and linguistic expression (maintained by general-purpose data). Our findings reveal three critical insights: (1) process supervision is highly effective but requires stabilization with general data: a ratio of at least one creative sample to twelve general samples is needed to achieve optimal performance, and below this threshold the win rate progressively degrades (from 62.75% down to 35.78%); (2) creative capabilities are culturally bound, with no cross-lingual transfer (an 89.26pp gap between Chinese and English performance); and (3) lexical diversity inversely correlates with creative quality (the TTR paradox), suggesting that high diversity signals compensatory behavior for logical deficiencies. These findings establish that creative excellence emerges from the interaction between logical scaffolding and linguistic grounding, analogous to how mathematical reasoning enhances but cannot replace linguistic competence in foundation models.
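
Two quantities in the abstract are easy to make concrete: the type-token ratio (TTR) behind the "TTR paradox", and the data mix implied by the 1:12 ratio. The sketch below is illustrative only and is not the authors' code. It assumes the standard definition of TTR (distinct tokens divided by total tokens), uses a hypothetical character-level tokenizer for Chinese text (the abstract does not say how tokens are counted), and reads the stated ratio as twelve general samples per creative sample.

```python
def type_token_ratio(tokens):
    """Standard TTR: number of distinct tokens divided by total tokens."""
    if not tokens:
        raise ValueError("TTR is undefined for an empty token sequence")
    return len(set(tokens)) / len(tokens)


def char_tokens(text):
    """Hypothetical tokenizer (an assumption): one token per non-whitespace character."""
    return [ch for ch in text if not ch.isspace()]


def min_general_samples(n_creative, general_per_creative=12):
    """General samples needed for a 1:12 creative-to-general mix (assumed reading)."""
    return n_creative * general_per_creative


# Every character in this line is distinct, so its character-level TTR is 1.0.
ttr_example = type_token_ratio(char_tokens("春风又绿江南岸"))

# With the dataset's 1,665 triplets, a 1:12 mix implies 19,980 general samples.
general_needed = min_general_samples(1665)
```

A higher TTR only means more distinct tokens per token of text; per the abstract's third finding, it should not be read as a proxy for creative quality.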