Modeling Multiple Support Strategies within a Single Turn for Emotional Support Conversations

Abstract

Emotional Support Conversation (ESC) aims to assist individuals experiencing distress by generating empathetic and supportive dialogue. Prior work typically assumes that each supporter turn corresponds to a single strategy, whereas real-world supportive communication often combines multiple strategies within a single utterance. In this paper, we revisit the ESC task by formulating it as multi-strategy utterance generation, where each utterance may contain one or more strategy-response pairs. We propose two generation methods: All-in-One, which predicts all strategy-response pairs in a single decoding step, and One-by-One, which iteratively generates strategy-response pairs until completion. Both methods are further enhanced with cognitive reasoning guided by reinforcement learning to improve strategy selection and response composition. We evaluate our models on the ESConv dataset under both utterance-level and dialogue-level settings. Experimental results show that our methods effectively model multi-strategy utterances and improve both supportive quality and dialogue success. To our knowledge, this work provides the first systematic empirical evidence that allowing multiple support strategies within a single utterance is both feasible and beneficial for emotional support conversations. All code and data will be publicly available at https://github.com/aliyun/qwen-dianjin.
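The difference between the two generation methods can be illustrated with a minimal sketch. It is not the authors' implementation: the bracketed serialization format, the function names, the `max_pairs` cap, and the stub generators standing in for a language model are all assumptions made for illustration, and the reinforcement-learning-guided reasoning step is omitted entirely.

```python
import re
from typing import Callable, List, Optional, Tuple

Pair = Tuple[str, str]  # (strategy, response)

# Assumed serialization: "[Strategy] response [Strategy] response ..."
PAIR_PATTERN = re.compile(r"\[([^\]]+)\]\s*([^\[]*)")


def parse_pairs(text: str) -> List[Pair]:
    """Split one decoded string into (strategy, response) pairs."""
    return [(s.strip(), r.strip()) for s, r in PAIR_PATTERN.findall(text)]


def all_in_one(generate: Callable[[str], str], context: str) -> List[Pair]:
    """All-in-One: a single model call emits every pair of the utterance."""
    return parse_pairs(generate(context))


def one_by_one(
    generate_next: Callable[[str, List[Pair]], Optional[Pair]],
    context: str,
    max_pairs: int = 4,  # hypothetical safety cap, not taken from the paper
) -> List[Pair]:
    """One-by-One: request one pair at a time until the model signals stop."""
    pairs: List[Pair] = []
    while len(pairs) < max_pairs:
        nxt = generate_next(context, pairs)
        if nxt is None:  # None plays the role of an end-of-utterance signal
            break
        pairs.append(nxt)
    return pairs


# Stub "models" that replay a fixed utterance; a real system would call an LLM.
SCRIPT: List[Pair] = [
    ("Reflection of Feelings", "It sounds like you feel overwhelmed."),
    ("Providing Suggestions", "Would a short break help?"),
]


def stub_generate(context: str) -> str:
    return " ".join(f"[{s}] {r}" for s, r in SCRIPT)


def stub_generate_next(context: str, so_far: List[Pair]) -> Optional[Pair]:
    return SCRIPT[len(so_far)] if len(so_far) < len(SCRIPT) else None


if __name__ == "__main__":
    print(all_in_one(stub_generate, "I failed my exam."))
    print(one_by_one(stub_generate_next, "I failed my exam."))
```

With the stubs above, both functions return the same two strategy-response pairs. The structural difference is that `all_in_one` fixes the number of pairs in one pass, while `one_by_one` conditions each new pair on those already produced and decides separately when to stop.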