Multimodal Situational Safety

October 8, 2024
Authors: Kaiwen Zhou, Chengzhi Liu, Xuandong Zhao, Anderson Compalas, Dawn Song, Xin Eric Wang
cs.AI

Abstract

Multimodal Large Language Models (MLLMs) are rapidly evolving, demonstrating impressive capabilities as multimodal assistants that interact with both humans and their environments. However, this increased sophistication introduces significant safety concerns. In this paper, we present the first evaluation and analysis of a novel safety challenge termed Multimodal Situational Safety, which explores how safety considerations vary based on the specific situation in which the user or agent is engaged. We argue that for an MLLM to respond safely, whether through language or action, it often needs to assess the safety implications of a language query within its corresponding visual context. To evaluate this capability, we develop the Multimodal Situational Safety benchmark (MSSBench) to assess the situational safety performance of current MLLMs. The dataset comprises 1,820 language query-image pairs; in half of them the image context is safe, and in the other half it is unsafe. We also develop an evaluation framework that analyzes key safety aspects, including explicit safety reasoning, visual understanding, and, crucially, situational safety reasoning. Our findings reveal that current MLLMs struggle with this nuanced safety problem in the instruction-following setting and have difficulty tackling these situational safety challenges all at once, highlighting a key area for future research. Furthermore, we develop multi-agent pipelines that collaboratively address safety challenges and show consistent improvement in safety over the original MLLM response. Code and data: mssbench.github.io.
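To make the safe/unsafe split concrete, the following is a minimal sketch of how responses on such paired data might be scored. It is not the MSSBench evaluation code: the `Example` and `Judgement` types, the `model_complied` label, and the `situational_accuracy` function are hypothetical names introduced for illustration, and the scoring rule (comply in safe contexts, decline or caution in unsafe ones) is an assumption drawn only from the abstract's description.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Example:
    """One language query-image pair (hypothetical schema)."""
    query: str
    image_path: str
    context_is_safe: bool  # True if the image shows a safe situation


@dataclass
class Judgement:
    """A judged model response (hypothetical label)."""
    example: Example
    model_complied: bool  # True if the model simply carried out the request


def situational_accuracy(judgements: List[Judgement]) -> Dict[str, float]:
    """Score each half separately, then average the two halves.

    Assumed rule: complying is correct in a safe context, and
    not complying (refusing or warning) is correct in an unsafe one.
    An empty half scores 0.0.
    """
    safe = [j for j in judgements if j.example.context_is_safe]
    unsafe = [j for j in judgements if not j.example.context_is_safe]

    safe_acc = sum(j.model_complied for j in safe) / len(safe) if safe else 0.0
    unsafe_acc = (
        sum(not j.model_complied for j in unsafe) / len(unsafe) if unsafe else 0.0
    )
    return {
        "safe": safe_acc,
        "unsafe": unsafe_acc,
        "average": (safe_acc + unsafe_acc) / 2,
    }
```

Under this assumed rule, a model that ignores the image and always complies scores perfectly on the safe half and zero on the unsafe half, so its average is 0.5. Doing better requires actually using the visual context.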
