Beyond One World: Benchmarking Superheroes in Role-Playing Across Multiversal Contexts
October 16, 2025
Authors: Perapard Ngokpol, Kun Kerdthaisong, Pasin Buakhaw, Pitikorn Khlaisamniang, Supasate Vorathammathorn, Piyalitt Ittichaiwong, Nutchanon Yongsatianchot
cs.AI
Abstract
Large language models (LLMs) are increasingly used as role-playing agents,
yet their capacity to faithfully and consistently portray version-specific
characters -- for example, superheroes across comic and cinematic universes --
remains underexplored. Superhero canons such as Marvel and DC provide a rich
testbed: decades of storytelling yield multiple incarnations of the same
character with distinct histories, values, and moral codes. To study this
problem, we introduce Beyond One World, a benchmark for character-grounded
roleplay spanning 30 iconic heroes and 90 canon-specific versions. The
benchmark comprises two tasks: (i) Canon Events, which probes factual recall of
pivotal life stages, and (ii) Moral Dilemmas, which confronts models with
ethically charged scenarios. We score responses for canonical accuracy and
reasoning fidelity under a framework that separates internal deliberation
("thinking") from outward decisions ("acting"). We further propose Think-Act
Matching, a metric that quantifies alignment between reasons and actions and
serves as a proxy for model trustworthiness. Experiments across reasoning- and
non-reasoning-oriented models yield three findings: (1) chain-of-thought
prompting improves narrative coherence in weaker models but can reduce
canonical accuracy in stronger ones; (2) cross-version generalization within a
character remains a major obstacle; and (3) models often excel at either
thinking or acting, but rarely both. Beyond One World exposes critical gaps in
multiversal consistency and reasoning alignment, offering a challenging
evaluation for role-playing LLMs.
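The abstract describes Think-Act Matching only as a metric that "quantifies alignment between reasons and actions." One plausible minimal reading is an agreement rate between the choice implied by a model's internal deliberation and the choice it actually enacts; the sketch below illustrates that reading. All names (`Response`, `think_act_matching`) and the reduction to a simple agreement fraction are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch of a Think-Act Matching score, assuming it reduces to
# the fraction of scenarios where the action implied by the model's
# deliberation ("thinking") agrees with the action it outputs ("acting").
from dataclasses import dataclass


@dataclass
class Response:
    thought_choice: str  # action implied by the model's internal deliberation
    acted_choice: str    # action the model actually takes in character


def think_act_matching(responses: list[Response]) -> float:
    """Return the fraction of responses whose thought and act agree."""
    if not responses:
        return 0.0
    matches = sum(r.thought_choice == r.acted_choice for r in responses)
    return matches / len(responses)


# Example: two of three moral-dilemma responses are internally consistent.
score = think_act_matching([
    Response("spare the villain", "spare the villain"),
    Response("call for backup", "fight alone"),
    Response("protect civilians", "protect civilians"),
])
print(score)  # ≈ 0.667
```

Under this reading, a model can score high on canonical accuracy yet low on Think-Act Matching whenever its stated reasons and its enacted decision diverge, which is the trustworthiness gap the metric is meant to surface.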