Capable but Careless: Do Computer-Use Agents Follow Contextual Integrity?
June 22, 2026
Authors: Anmol Goel, Iryna Gurevych
cs.AI
Abstract
Computer-use agents (CUAs) now act on a user's behalf across personal applications such as email, calendars, and to-do lists. This cross-application access is useful, but it also creates a privacy risk that has been largely overlooked: while working in one context, an agent can pull in information from another context where that information does not belong. To address this, we introduce AgentCIBench, an evaluation harness that turns this risk into executable, deterministically scored scenarios. We target three common failure modes in CUAs: visual co-location, where the agent pulls in prohibited items that sit next to the task target in the UI; task-ambiguity overshare, where the agent dumps dense personal state in response to an under-specified prompt; and recipient misalignment, where the agent sends content to an addressee for whom it is inappropriate. We evaluate 15 frontier agents and find a surprisingly high failure rate: 11 of the 15 leak in more than 50% of scenarios, the average leakage rate is 67.9%, and the same failures persist when agents act end-to-end in the environment to complete the task. We release AgentCIBench to encourage the development of safer computer-use agents and to position contextual disclosure testing as a pre-deployment safety check.
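To make "deterministically scored" concrete, the following is a minimal sketch of how a leak check and a per-agent leakage rate could be computed. It is not the AgentCIBench API: the `Scenario` class, the `leaked` and `leak_rate` helpers, the substring-matching rule, and all scenario names and outputs are hypothetical and invented purely for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Scenario:
    """Hypothetical test case: a task plus items that must not be disclosed."""
    name: str
    failure_mode: str              # e.g. "visual_co_location"
    prohibited: Tuple[str, ...]    # items inappropriate in the task's context


def leaked(scenario: Scenario, agent_output: str) -> bool:
    """Deterministic check (assumed rule): any prohibited item appears verbatim."""
    text = agent_output.casefold()
    return any(item.casefold() in text for item in scenario.prohibited)


def leak_rate(scenarios: List[Scenario], outputs: Dict[str, str]) -> float:
    """Fraction of scenarios in which the agent disclosed a prohibited item."""
    flags = [leaked(s, outputs[s.name]) for s in scenarios]
    return sum(flags) / len(flags)


# One made-up scenario per failure mode named in the abstract.
SCENARIOS = [
    Scenario("reply_to_vendor", "visual_co_location", ("biopsy results",)),
    Scenario("summarize_my_week", "task_ambiguity_overshare", ("therapy appointment",)),
    Scenario("invite_coworker", "recipient_misalignment", ("custody hearing",)),
]

# Made-up agent outputs, not real benchmark data.
OUTPUTS = {
    "reply_to_vendor": "Thanks, the invoice is approved.",
    "summarize_my_week": "Mon: standup. Tue: therapy appointment at 3pm. Wed: demo.",
    "invite_coworker": "Can't do Friday, I have a custody hearing. How about Monday?",
}

if __name__ == "__main__":
    rate = leak_rate(SCENARIOS, OUTPUTS)
    print(f"leak rate: {rate:.1%}")
    print("leaks in more than 50% of scenarios:", rate > 0.5)
```

Because the check is a pure function of the scenario and the agent's output, the same run always yields the same score, which is the property the abstract emphasizes. A real harness would presumably inspect agent actions in the environment rather than a single output string.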