AgentHijack: Benchmarking the Robustness of Computer-Use Agents under Common Environmental Corruptions

Abstract

Autonomous computer-use agents powered by multimodal large language models (MLLMs) are emerging as capable assistants for completing complex digital workflows. However, real-world execution environments are far from ideal: pop-ups, resolution changes, and competing applications frequently interfere with agent perception and control. We introduce AgentHijack, a benchmark designed to evaluate the robustness of computer-use agents under common corruptions, in which the uncertainties of dynamic environments disrupt the execution flow without direct adversarial intent. Specifically, AgentHijack introduces 9 configurable common corruptions to replicate realistic imperfect scenarios. We evaluate MLLM-based agents on a variety of desktop tasks and find that even minor corruptions can cause substantial performance degradation, which highlights the fragility of these agents and underscores the need for robustness evaluation. We then propose AgentHijack-Agent, a framework that integrates an action generator with enhanced grounding capabilities and an onlooker responsible for behavior summarization and environment checking. Extensive experiments validate its effectiveness. Our code, environment, baseline models, and data are publicly available at https://AgentHijack.github.io.
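
The division of labor between the action generator and the onlooker can be illustrated with a toy loop. This is only a minimal sketch under stated assumptions, not the AgentHijack-Agent implementation: the names `Observation`, `Onlooker`, `generate_action`, and `run_episode` are hypothetical, the observation is reduced to a single pop-up flag, and the MLLM-based action generator is replaced by a placeholder function.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Observation:
    # Hypothetical stand-in for a screenshot: only records whether a pop-up is shown.
    popup_visible: bool = False


@dataclass
class Onlooker:
    # Keeps a running log of executed actions and inspects each observation.
    history: List[str] = field(default_factory=list)

    def check(self, obs: Observation) -> Optional[str]:
        # Environment checking: return a recovery action when a disturbance is seen.
        return "close_popup" if obs.popup_visible else None

    def summarize(self) -> str:
        # Behavior summarization: a compact trace of what has been done so far.
        return " -> ".join(self.history)


def generate_action(step: int) -> str:
    # Placeholder for the MLLM-based action generator (assumption, not the real model).
    return f"task_action_{step}"


def run_episode(observations: List[Observation], onlooker: Onlooker) -> List[str]:
    step = 0
    for obs in observations:
        recovery = onlooker.check(obs)
        if recovery is not None:
            action = recovery  # deal with the disturbance before resuming the task
        else:
            action = generate_action(step)
            step += 1  # task progress advances only on task actions
        onlooker.history.append(action)
    return onlooker.history


onlooker = Onlooker()
trace = run_episode(
    [Observation(), Observation(popup_visible=True), Observation()],
    onlooker,
)
print(onlooker.summarize())  # task_action_0 -> close_popup -> task_action_1
```

In this toy run the pop-up at the second step is handled by the onlooker's recovery action, and the task resumes at the step where it was interrupted rather than skipping ahead.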