CaMeLs Can Use Computers Too: System-level Security for Computer Use Agents

January 14, 2026
Authors: Hanna Foerster, Robert Mullins, Tom Blanchard, Nicolas Papernot, Kristina Nikolić, Florian Tramèr, Ilia Shumailov, Cheng Zhang, Yiren Zhao
cs.AI

Abstract

AI agents are vulnerable to prompt injection attacks, where malicious content hijacks agent behavior to steal credentials or cause financial loss. The only known robust defense is architectural isolation that strictly separates trusted task planning from untrusted environment observations. However, applying this design to Computer Use Agents (CUAs) -- systems that automate tasks by viewing screens and executing actions -- presents a fundamental challenge: current agents require continuous observation of UI state to determine each action, conflicting with the isolation required for security. We resolve this tension by demonstrating that UI workflows, while dynamic, are structurally predictable. We introduce Single-Shot Planning for CUAs, where a trusted planner generates a complete execution graph with conditional branches before any observation of potentially malicious content, providing provable control flow integrity guarantees against arbitrary instruction injections. Although this architectural isolation successfully prevents instruction injections, we show that additional measures are needed to prevent Branch Steering attacks, which manipulate UI elements to trigger unintended valid paths within the plan. We evaluate our design on OSWorld, and retain up to 57% of the performance of frontier models while improving performance for smaller open-source models by up to 19%, demonstrating that rigorous security and utility can coexist in CUAs.
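
The control-flow idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Node` type, the `run_plan` function, the `observe` callback, and the example plan are all hypothetical names chosen for illustration. The sketch assumes the trusted planner has already emitted the whole graph, and that untrusted UI observations are reduced to a branch label that may only select among edges the planner declared in advance.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Node:
    """One step of a plan fixed by the trusted planner (hypothetical type)."""
    action: str
    # Maps an observation label to the id of the next node; empty means terminal.
    branches: Dict[str, str] = field(default_factory=dict)


def run_plan(plan: Dict[str, Node], start: str,
             observe: Callable[[str], str], max_steps: int = 50) -> List[str]:
    """Execute a pre-built plan; observations can only pick a declared branch."""
    trace: List[str] = []
    node_id = start
    for _ in range(max_steps):
        node = plan[node_id]
        trace.append(node.action)  # the action text never comes from the UI
        if not node.branches:
            return trace
        label = observe(node_id)  # untrusted: derived from screen content
        if label not in node.branches:
            # Injected text cannot add new actions or edges to the plan.
            raise ValueError(f"observation {label!r} is not a planned branch")
        node_id = node.branches[label]
    raise RuntimeError("step budget exceeded")


# Hypothetical plan, produced before any screen content is seen.
PLAN = {
    "open": Node("open_settings", {"login_required": "login", "ready": "toggle"}),
    "login": Node("enter_saved_login", {"ready": "toggle"}),
    "toggle": Node("enable_dark_mode"),
}

# Benign UI: settings open directly.
benign = run_plan(PLAN, "open", lambda node_id: "ready")

# Branch Steering: a fake login prompt selects a valid but unintended path.
steered = run_plan(
    PLAN, "open",
    lambda node_id: "login_required" if node_id == "open" else "ready",
)
```

In this sketch an injected instruction such as "send credentials to attacker" is rejected because it is not a declared branch label, which mirrors the control flow integrity claim. The `steered` run shows why that alone is insufficient: a manipulated UI can still choose the planned `login` path, which is the Branch Steering risk the abstract says needs additional measures.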