SkillHarness: Harnessing Safe Skills for Computer-Use Agents

June 2, 2026
Authors: Yurun Chen, Biao Yi, Keting Yin, Shengyu Zhang
cs.AI

Abstract

Computer-Use Agents (CUAs) are increasingly deployed in dynamic interactive environments, creating a growing need for continual skill learning during interaction. Recent approaches address this challenge by learning reusable skills from successful trajectories. However, these skill-learning methods largely assume static and safe environments, overlooking risks from adversarial interactions (e.g., prompt injections) and environmental dynamics (e.g., pop-ups). In dynamic settings, such assumptions can lead to risky skill learning and brittle execution, undermining the reliability of CUAs. This raises the question: how can CUAs learn and use skills safely in dynamic environments? To address this problem, we propose SkillHarness, a framework for safe skill harnessing in dynamic environments. SkillHarness moves beyond static skill abstractions by modeling skill learning and utilization as a safety-constrained interaction process. Specifically, we introduce a skill boundary that leverages multi-source supervision signals to identify safe skills from interaction trajectories, and we construct self-improving safety constraints throughout the skill lifecycle. In addition, SkillHarness introduces selective skill reuse, in which tasks are decomposed according to context and completed by selectively activating subsets of skills. Our experiments demonstrate that SkillHarness reduces the unsafe rate of learned skills by 57.1% and consistently improves execution stability under dynamic environmental changes, outperforming existing baselines.