Investigating Safety Vulnerabilities of Large Audio-Language Models Under Speaker Emotional Variations
October 19, 2025
Authors: Bo-Han Feng, Chien-Feng Liu, Yu-Hsuan Li Liang, Chih-Kai Yang, Szu-Wei Fu, Zhehuai Chen, Ke-Han Lu, Sung-Feng Huang, Chao-Han Huck Yang, Yu-Chiang Frank Wang, Yun-Nung Chen, Hung-yi Lee
cs.AI
Abstract
Large audio-language models (LALMs) extend text-based LLMs with auditory
understanding, offering new opportunities for multimodal applications. While
their perception, reasoning, and task performance have been widely studied,
their safety alignment under paralinguistic variation remains underexplored.
This work systematically investigates the role of speaker emotion. We construct
a dataset of malicious speech instructions expressed across multiple emotions
and intensities, and evaluate several state-of-the-art LALMs. Our results
reveal substantial safety inconsistencies: different emotions elicit varying
levels of unsafe responses, and the effect of intensity is non-monotonic, with
medium expressions often posing the greatest risk. These findings highlight an
overlooked vulnerability in LALMs and call for alignment strategies explicitly
designed to ensure robustness under emotional variation, a prerequisite for
trustworthy deployment in real-world settings.