RiOSWorld: Benchmarking the Risk of Multimodal Computer-Use Agents

Abstract

Multimodal large language models (MLLMs) are developing rapidly and are increasingly deployed as autonomous computer-use agents capable of accomplishing complex computer tasks. This raises a pressing question: can the safety risk principles designed and aligned for general MLLMs in dialogue scenarios be effectively transferred to real-world computer-use scenarios? Existing research on evaluating the safety risks of MLLM-based computer-use agents suffers from several limitations: it either lacks realistic interactive environments or focuses narrowly on one or a few specific risk types. Such evaluations overlook the complexity, variability, and diversity of real-world environments, which restricts comprehensive risk evaluation for computer-use agents. To address this, we introduce RiOSWorld, a benchmark designed to evaluate the potential risks of MLLM-based agents during real-world computer manipulations. Our benchmark includes 492 risky tasks spanning various computer applications, including web, social media, multimedia, OS, email, and office software. We categorize these risks into two major classes based on their source: (i) user-originated risks and (ii) environmental risks. We evaluate safety risks from two perspectives: (i) risk goal intention and (ii) risk goal completion. Extensive experiments with multimodal agents on RiOSWorld demonstrate that current computer-use agents face significant safety risks in real-world scenarios. Our findings highlight the necessity and urgency of safety alignment for computer-use agents in real-world computer manipulation and provide valuable insights for developing trustworthy computer-use agents. Our benchmark is publicly available at https://yjyddq.github.io/RiOSWorld.github.io/.
