

MOA: Multi-Objective Alignment for Role-Playing Agents

December 10, 2025
Authors: Chonghua Liao, Ke Wang, Yuchuan Wu, Fei Huang, Yongbin Li
cs.AI

Abstract

Role-playing agents (RPAs) must simultaneously master several conflicting skills -- following multi-turn instructions, exhibiting domain knowledge, and maintaining a consistent linguistic style. Existing work either relies on supervised fine-tuning (SFT), which overfits surface cues and yields low diversity, or applies reinforcement learning (RL) that fails to optimize RPAs comprehensively across multiple dimensions. We present MOA (Multi-Objective Alignment), a reinforcement learning framework that enables multi-dimensional, fine-grained rubric optimization for general RPAs. MOA introduces a novel multi-objective optimization strategy that trains simultaneously on multiple fine-grained rubrics to improve optimization performance. In addition, to improve output diversity and quality, MOA employs thought-augmented rollouts with off-policy guidance. Extensive experiments on challenging benchmarks such as PersonaGym and RoleMRC show that MOA enables an 8B model to match or even outperform strong baselines such as GPT-4o and Claude across numerous dimensions, demonstrating MOA's potential for building RPAs that simultaneously satisfy the demands of role knowledge, persona style, diverse scenarios, and complex multi-turn conversations.
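To make the multi-rubric idea concrete, below is a minimal, illustrative sketch (not the paper's actual MOA algorithm, which the abstract does not specify) of how per-dimension rubric scores for sampled rollouts might be combined into a single training signal and normalized within a sampling group, as in group-relative RL methods. The dimension names, weights, and helper functions are assumptions introduced only for illustration.

```python
# Illustrative sketch only: MOA's exact update rule is not given in the abstract.
# Shows one plausible way to fold multiple fine-grained rubric scores
# (e.g., instruction following, role knowledge, persona style) into a scalar
# reward per rollout, then normalize rewards within the sampled group.
from dataclasses import dataclass
from typing import Dict, List
import statistics


@dataclass
class Rollout:
    text: str
    rubric_scores: Dict[str, float]  # per-dimension scores, assumed in [0, 1]


# Hypothetical rubric weights; not taken from the paper.
RUBRIC_WEIGHTS = {
    "instruction_following": 1.0,
    "role_knowledge": 1.0,
    "persona_style": 1.0,
}


def scalarize(rollout: Rollout, weights: Dict[str, float]) -> float:
    """Weighted average of per-rubric scores -> one scalar reward."""
    total = sum(weights.values())
    return sum(w * rollout.rubric_scores.get(k, 0.0) for k, w in weights.items()) / total


def group_advantages(rollouts: List[Rollout], weights: Dict[str, float]) -> List[float]:
    """Normalize scalar rewards within the sampled group (zero mean, unit std)."""
    rewards = [scalarize(r, weights) for r in rollouts]
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against a degenerate group
    return [(r - mean) / std for r in rewards]


if __name__ == "__main__":
    group = [
        Rollout("reply A", {"instruction_following": 0.9, "role_knowledge": 0.6, "persona_style": 0.8}),
        Rollout("reply B", {"instruction_following": 0.5, "role_knowledge": 0.7, "persona_style": 0.4}),
    ]
    print(group_advantages(group, RUBRIC_WEIGHTS))
```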