OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows
October 28, 2025
Authors: Qiushi Sun, Mukai Li, Zhoumianze Liu, Zhihui Xie, Fangzhi Xu, Zhangyue Yin, Kanzhi Cheng, Zehao Li, Zichen Ding, Qi Liu, Zhiyong Wu, Zhuosheng Zhang, Ben Kao, Lingpeng Kong
cs.AI
Abstract
Computer-using agents powered by Vision-Language Models (VLMs) have
demonstrated human-like capabilities in operating digital environments like
mobile platforms. While these agents hold great promise for advancing digital
automation, their potential for unsafe operations, such as system compromise
and privacy leakage, is raising significant concerns. Detecting these safety
concerns across the vast and complex operational space of mobile environments
presents a formidable challenge that remains critically underexplored. To
establish a foundation for mobile agent safety research, we introduce
MobileRisk-Live, a dynamic sandbox environment accompanied by a safety
detection benchmark comprising realistic trajectories with fine-grained
annotations. Built upon this, we propose OS-Sentinel, a novel hybrid safety
detection framework that synergistically combines a Formal Verifier for
detecting explicit system-level violations with a VLM-based Contextual Judge
for assessing contextual risks and agent actions. Experiments show that
OS-Sentinel achieves 10%-30% improvements over existing approaches across
multiple metrics. Further analysis provides critical insights that foster the
development of safer and more reliable autonomous mobile agents.
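
The hybrid idea described in the abstract, an explicit rule check paired with a contextual judgment, can be illustrated with a minimal sketch. This is not the authors' implementation: the `Step` record, the `FORBIDDEN_SUBSTRINGS` rules, the OR-style combination, the 0.5 threshold, and the keyword-based `stub_judge` (a stand-in for a VLM call) are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One step of a hypothetical agent trajectory."""
    action: str  # e.g. "tap", "type", "shell"
    detail: str  # free-text description of what the step does


# Hypothetical explicit rules; a real verifier would inspect system state.
FORBIDDEN_SUBSTRINGS = ("rm -rf", "pm uninstall")


def formal_verifier(trajectory: List[Step]) -> bool:
    """Return True if any shell step matches an explicit rule."""
    return any(
        step.action == "shell"
        and any(pattern in step.detail for pattern in FORBIDDEN_SUBSTRINGS)
        for step in trajectory
    )


def hybrid_is_unsafe(
    trajectory: List[Step],
    contextual_judge: Callable[[List[Step]], float],
    threshold: float = 0.5,
) -> bool:
    """Flag a trajectory if either component flags it (assumed OR rule)."""
    if formal_verifier(trajectory):
        return True
    # The judge returns a risk score in [0, 1]; the threshold is an assumption.
    return contextual_judge(trajectory) >= threshold


def stub_judge(trajectory: List[Step]) -> float:
    """Keyword stand-in for a VLM-based judge, used only to run the sketch."""
    return 0.9 if any("password" in s.detail for s in trajectory) else 0.1


if __name__ == "__main__":
    risky_rule = [Step("shell", "rm -rf /sdcard")]
    risky_context = [Step("type", "send password to unknown contact")]
    benign = [Step("tap", "open settings")]
    for name, traj in [("rule", risky_rule), ("context", risky_context), ("benign", benign)]:
        print(name, hybrid_is_unsafe(traj, stub_judge))
```

In this sketch the rule check runs first because it is cheap and deterministic, and the judge is consulted only when no explicit rule fires. How the paper actually combines the two components may differ.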