OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows
October 28, 2025
Authors: Qiushi Sun, Mukai Li, Zhoumianze Liu, Zhihui Xie, Fangzhi Xu, Zhangyue Yin, Kanzhi Cheng, Zehao Li, Zichen Ding, Qi Liu, Zhiyong Wu, Zhuosheng Zhang, Ben Kao, Lingpeng Kong
cs.AI
Abstract
Computer-using agents powered by Vision-Language Models (VLMs) have
demonstrated human-like capabilities in operating digital environments like
mobile platforms. While these agents hold great promise for advancing digital
automation, their potential for unsafe operations, such as system compromise
and privacy leakage, is raising significant concerns. Detecting these safety
concerns across the vast and complex operational space of mobile environments
presents a formidable challenge that remains critically underexplored. To
establish a foundation for mobile agent safety research, we introduce
MobileRisk-Live, a dynamic sandbox environment accompanied by a safety
detection benchmark comprising realistic trajectories with fine-grained
annotations. Built upon this, we propose OS-Sentinel, a novel hybrid safety
detection framework that synergistically combines a Formal Verifier for
detecting explicit system-level violations with a VLM-based Contextual Judge
for assessing contextual risks and agent actions. Experiments show that
OS-Sentinel achieves 10%-30% improvements over existing approaches across
multiple metrics. Further analysis provides critical insights that foster the
development of safer and more reliable autonomous mobile agents.
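
The hybrid idea described above, explicit rule checks combined with a context-sensitive judge, can be illustrated with a minimal sketch. This is not the authors' implementation: the `Step` schema, the two example rules, the keyword-based stand-in for the VLM judge, and the `threshold` parameter are all hypothetical, chosen only to show how the two signals might be combined into one verdict.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Step:
    """One step of an agent trajectory (hypothetical schema, not from the paper)."""
    action: str            # e.g. "tap", "share", "shell"
    argument: str = ""     # command issued or text entered
    screen_text: str = ""  # text visible on screen at this step


# A rule returns True when a single step explicitly violates it.
Rule = Callable[[Step], bool]


def touches_protected_path(step: Step) -> bool:
    # Illustrative rule: shell access to a system directory.
    return step.action == "shell" and "/data/system" in step.argument


def clears_lock_settings(step: Step) -> bool:
    # Illustrative rule: a shell command that clears the device lock.
    return step.action == "shell" and "locksettings clear" in step.argument


RULES: List[Rule] = [touches_protected_path, clears_lock_settings]


def formal_verifier(trajectory: Sequence[Step]) -> bool:
    """Deterministic check: True if any step breaks any explicit rule."""
    return any(rule(step) for step in trajectory for rule in RULES)


def stub_contextual_judge(trajectory: Sequence[Step]) -> float:
    """Stand-in for a VLM-based judge; returns a risk score in [0, 1].

    A real judge would query a model with screenshots and actions.
    This stub only flags sharing a screen that shows sensitive terms.
    """
    sensitive = ("password", "verification code")
    for step in trajectory:
        text = step.screen_text.lower()
        if step.action == "share" and any(term in text for term in sensitive):
            return 1.0
    return 0.0


def is_unsafe(
    trajectory: Sequence[Step],
    judge: Callable[[Sequence[Step]], float] = stub_contextual_judge,
    threshold: float = 0.5,  # assumed cut-off for the judge's score
) -> bool:
    """Hybrid verdict: unsafe if either component raises a flag."""
    return formal_verifier(trajectory) or judge(trajectory) >= threshold
```

In this sketch the two components are joined with a simple logical OR, so the rule-based check catches explicit system-level violations regardless of the judge's score, while the judge covers cases that no fixed rule anticipates. How OS-Sentinel actually combines its components is not specified in the abstract.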