ChARM: Character-based Act-adaptive Reward Modeling for Advanced Role-Playing Language Agents
May 29, 2025
Authors: Feiteng Fang, Ting-En Lin, Yuchuan Wu, Xiong Liu, Xiang Huang, Dingwei Chen, Jing Ye, Haonan Zhang, Liang Zhu, Hamid Alinejad-Rokny, Min Yang, Fei Huang, Yongbin Li
cs.AI
Abstract
Role-Playing Language Agents (RPLAs) aim to simulate characters for realistic
and engaging human-computer interactions. However, traditional reward models
often struggle with scalability and adapting to subjective conversational
preferences. We propose ChARM, a Character-based Act-adaptive Reward Model,
addressing these challenges through two innovations: (1) an act-adaptive margin
that significantly enhances learning efficiency and generalizability, and (2) a
self-evolution mechanism leveraging large-scale unlabeled data to improve
training coverage. Additionally, we introduce RoleplayPref, the first
large-scale preference dataset specifically for RPLAs, featuring 1,108
characters, 13 subcategories, and 16,888 bilingual dialogues, alongside
RoleplayEval, a dedicated evaluation benchmark. Experimental results show a 13%
improvement over the conventional Bradley-Terry model in preference rankings.
Furthermore, applying ChARM-generated rewards to preference learning techniques
(e.g., direct preference optimization) achieves state-of-the-art results on
CharacterEval and RoleplayEval. Code and dataset are available at
https://github.com/calubkk/ChARM.
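For intuition, the act-adaptive margin can be read as a modification of the standard pairwise Bradley-Terry reward objective that demands a larger reward gap for certain (chosen, rejected) pairs. The sketch below illustrates that general idea only; the function name `bt_loss_with_margin`, the tensor shapes, and the way margins are derived are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch (not the authors' code): a pairwise Bradley-Terry reward loss
# with a per-example margin, illustrating how an "act-adaptive" margin could
# widen the required reward gap between chosen and rejected responses.
import torch
import torch.nn.functional as F

def bt_loss_with_margin(r_chosen: torch.Tensor,
                        r_rejected: torch.Tensor,
                        margin: torch.Tensor) -> torch.Tensor:
    """Pairwise loss: -log sigmoid(r_chosen - r_rejected - margin).

    r_chosen / r_rejected: scalar reward-model scores per pair, shape (B,).
    margin: per-pair margin, shape (B,); margin = 0 recovers the vanilla
    Bradley-Terry objective, larger values demand a bigger reward gap.
    """
    return -F.logsigmoid(r_chosen - r_rejected - margin).mean()

# Toy usage with hypothetical scores and margins (how ChARM actually sets the
# margin per character act is an assumption left out here).
scores_chosen = torch.tensor([1.2, 0.4, 0.9])
scores_rejected = torch.tensor([0.3, 0.5, 0.1])
margins = torch.tensor([0.0, 0.2, 0.5])
loss = bt_loss_with_margin(scores_chosen, scores_rejected, margins)
print(loss.item())
```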