Towards Efficient and Robust Linguistic Emotion Diagnosis for Mental Health via Multi-Agent Instruction Refinement
January 20, 2026
Authors: Jian Zhang, Zhangqi Wang, Zhiyuan Wang, Weiping Fu, Yu He, Haiping Zhu, Qika Lin, Jun Liu
cs.AI
Abstract
Linguistic expressions of emotions such as depression, anxiety, and trauma-related states are pervasive in clinical notes, counseling dialogues, and online mental health communities, and accurate recognition of these emotions is essential for clinical triage, risk assessment, and timely intervention. Although large language models (LLMs) have demonstrated strong generalization ability in emotion analysis tasks, their diagnostic reliability in high-stakes, context-intensive medical settings remains highly sensitive to prompt design. Moreover, existing methods face two key challenges: emotional comorbidity, in which multiple intertwined emotional states complicate prediction, and inefficient exploration of clinically relevant cues. To address these challenges, we propose APOLO (Automated Prompt Optimization for Linguistic Emotion Diagnosis), a framework that systematically explores a broader and finer-grained prompt space to improve diagnostic efficiency and robustness. APOLO formulates instruction refinement as a Partially Observable Markov Decision Process and adopts a multi-agent collaboration mechanism involving Planner, Teacher, Critic, Student, and Target roles. Within this closed-loop framework, the Planner defines an optimization trajectory, while the Teacher-Critic-Student agents iteratively refine prompts to enhance reasoning stability and effectiveness, and the Target agent determines whether to continue optimization based on performance evaluation. Experimental results show that APOLO consistently improves diagnostic accuracy and robustness across domain-specific and stratified benchmarks, demonstrating a scalable and generalizable paradigm for trustworthy LLM applications in mental healthcare.
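The closed loop described in the abstract can be pictured with a small toy sketch. This is not the authors' implementation: every name here (`CUES`, `LoopState`, `refine_prompt`, and the five role functions) is hypothetical, the agents are plain deterministic functions rather than LLM calls, and `evaluate` is a stand-in keyword score rather than a diagnostic benchmark. The only link to the POMDP framing is that the loop never sees the model's internal state and acts solely on an observed score.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical cue list standing in for clinically relevant signals.
CUES = ["comorbid", "context", "evidence"]


@dataclass
class LoopState:
    prompt: str
    score: float
    history: List[str] = field(default_factory=list)  # accepted prompts


def planner(prompt: str) -> List[str]:
    """Planner: lay out the trajectory as the cues the prompt still lacks."""
    return [cue for cue in CUES if cue not in prompt]


def teacher(prompt: str, focus: str) -> str:
    """Teacher: pose a guiding question about the current focus."""
    return f"How should the prompt address {focus} cues?"


def critic(question: str, focus: str) -> bool:
    """Critic: reject guiding questions that drift from the focus."""
    return focus in question


def student(prompt: str, focus: str, question: str) -> str:
    """Student: revise the prompt (this toy ignores the question's wording)."""
    return f"{prompt} Attend to {focus} cues."


def evaluate(prompt: str) -> float:
    """Toy performance estimate: fraction of cues the prompt mentions."""
    return sum(cue in prompt for cue in CUES) / len(CUES)


def refine_prompt(initial_prompt: str, max_rounds: int = 5) -> LoopState:
    state = LoopState(initial_prompt, evaluate(initial_prompt))
    for focus in planner(initial_prompt)[:max_rounds]:
        question = teacher(state.prompt, focus)
        if not critic(question, focus):
            continue  # discard an off-topic question and try the next step
        candidate = student(state.prompt, focus, question)
        candidate_score = evaluate(candidate)
        # Target: keep optimizing only while the observed score improves.
        if candidate_score <= state.score:
            break
        state = LoopState(candidate, candidate_score, state.history + [candidate])
    return state


final = refine_prompt("Identify the emotions expressed in the text.")
print(final.prompt)
print(final.score, len(final.history))
```

In a real system each role would presumably be an LLM call and `evaluate` would measure diagnostic performance on held-out data; the sketch only shows how the plan, the question-critique-revision step, and the stopping decision could fit together.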