The Persona Paradox: Medical Personas as Behavioral Priors in Clinical Language Models

Abstract

Persona conditioning can be viewed as a behavioral prior for large language models (LLMs) and is often assumed to confer expertise and improve safety monotonically. However, its effects on high-stakes clinical decision-making remain poorly characterized. We systematically evaluate persona-based control in clinical LLMs, examining how professional roles (e.g., Emergency Department physician, nurse) and interaction styles (bold vs. cautious) influence behavior across models and medical tasks. We assess performance on clinical triage and patient-safety tasks using multidimensional evaluations that capture task accuracy, calibration, and safety-relevant risk behavior. We find systematic, context-dependent, and non-monotonic effects: medical personas improve performance in critical care tasks, yielding gains of up to approximately +20% in accuracy and calibration, but degrade performance in primary-care settings by comparable margins. Interaction style modulates risk propensity and sensitivity, but its effect is highly model-dependent. While aggregated LLM-judge rankings favor medical over non-medical personas in safety-critical cases, human clinicians show moderate agreement on safety compliance (average Cohen's κ = 0.43) but indicate low confidence in 95.9% of their responses on reasoning quality. Our work shows that personas function as behavioral priors that introduce context-dependent trade-offs rather than guarantees of safety or expertise. The code is available at https://github.com/rsinghlab/Persona_Paradox.
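For readers unfamiliar with the agreement statistic quoted above, the following is a minimal sketch of how Cohen's κ is computed for two raters. It is not taken from the paper's code: the function name `cohens_kappa`, the label values, and the example ratings are hypothetical, and the abstract does not say how the per-pair values were averaged.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: computed from each rater's marginal label counts.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        # Both raters used one identical label throughout; kappa is
        # formally undefined here, and this sketch returns 1.0.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical safety-compliance labels from two clinicians (not study data).
clinician_1 = ["safe", "safe", "unsafe", "safe", "unsafe", "safe", "unsafe", "safe"]
clinician_2 = ["safe", "unsafe", "unsafe", "safe", "safe", "safe", "unsafe", "safe"]
print(round(cohens_kappa(clinician_1, clinician_2), 3))
```

In this toy example the raters agree on 6 of 8 items (0.75), but agreement expected by chance is already 0.53, so κ is much lower than the raw agreement rate. That gap is why a κ near 0.43 is conventionally described as moderate agreement.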
