CUA-Skill: Develop Skills for Computer Using Agent

January 28, 2026
Authors: Tianyi Chen, Yinheng Li, Michael Solodko, Sen Wang, Nan Jiang, Tingyuan Cui, Junheng Hao, Jongwoo Ko, Sara Abdali, Suzhen Zheng, Leon Xu, Hao Fan, Pashmina Cameron, Justin Wagle, Kazuhito Koishida
cs.AI

Abstract

Computer-Using Agents (CUAs) aim to autonomously operate computer systems to complete real-world tasks. However, existing agentic systems remain difficult to scale and lag behind human performance. A key limitation is the absence of reusable, structured skill abstractions that capture how humans interact with graphical user interfaces and how those skills can be leveraged. We introduce CUA-Skill, a computer-using agentic skill base that encodes human computer-use knowledge as skills coupled with parameterized execution and composition graphs. CUA-Skill is a large-scale library of carefully engineered skills spanning common Windows applications, serving as practical infrastructure and a tool substrate for scalable, reliable agent development. Building on this skill base, we construct CUA-Skill Agent, an end-to-end computer-using agent that supports dynamic skill retrieval, argument instantiation, and memory-aware failure recovery. Our results demonstrate that CUA-Skill substantially improves execution success rates and robustness on challenging end-to-end agent benchmarks, establishing a strong foundation for future computer-using agent development. On WindowsAgentArena, CUA-Skill Agent achieves a state-of-the-art success rate of 57.5% (best of three) while being significantly more efficient than prior and concurrent approaches. The project page is available at https://microsoft.github.io/cua_skill/.
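To make the three agent capabilities named in the abstract more concrete (dynamic skill retrieval, argument instantiation, and memory-aware failure recovery), the following is a minimal sketch and not the authors' implementation. Every name in it (`Skill`, `Memory`, `retrieve`, `instantiate`, `execute`, and the two toy skills) is hypothetical, the retriever is a simple word-overlap ranking chosen only for illustration, and the skill executors are stubs that touch no real GUI.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Skill:
    """Hypothetical skill record: name, description, required arguments, executor."""
    name: str
    description: str
    params: List[str]
    run: Callable[[Dict[str, str]], bool]  # returns True on success


@dataclass
class Memory:
    """Remembers failed (skill, arguments) attempts so they are not repeated."""
    failures: List[Tuple[str, tuple]] = field(default_factory=list)

    def _key(self, skill: Skill, args: Dict[str, str]) -> Tuple[str, tuple]:
        return (skill.name, tuple(sorted(args.items())))

    def has_failed(self, skill: Skill, args: Dict[str, str]) -> bool:
        return self._key(skill, args) in self.failures

    def record_failure(self, skill: Skill, args: Dict[str, str]) -> None:
        self.failures.append(self._key(skill, args))


def retrieve(task: str, skills: List[Skill]) -> List[Skill]:
    """Toy retriever: rank skills by word overlap between task and description."""
    words = set(task.lower().split())
    scored = [(len(words & set(s.description.lower().split())), s) for s in skills]
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [s for score, s in ranked if score > 0]


def instantiate(skill: Skill, slots: Dict[str, str]) -> Dict[str, str]:
    """Fill the skill's parameters from task slots; raise KeyError if one is missing."""
    return {p: slots[p] for p in skill.params}


def execute(task: str, slots: Dict[str, str],
            skills: List[Skill], memory: Memory) -> Optional[str]:
    """Try ranked skills in order, skipping attempts that memory marks as failed."""
    for skill in retrieve(task, skills):
        try:
            args = instantiate(skill, slots)
        except KeyError:
            continue  # a required argument is unavailable for this skill
        if memory.has_failed(skill, args):
            continue
        if skill.run(args):
            return skill.name
        memory.record_failure(skill, args)
    return None


# Stub executors: the menu route "fails", the shortcut route "succeeds".
calls: List[str] = []
menu = Skill("save_via_menu", "save document using the file menu", ["path"],
             lambda a: calls.append("menu") or False)
shortcut = Skill("save_via_shortcut", "save document with keyboard shortcut", ["path"],
                 lambda a: calls.append("shortcut") or True)

memory = Memory()
first = execute("save the document as a file", {"path": "report.docx"},
                [menu, shortcut], memory)
second = execute("save the document as a file", {"path": "report.docx"},
                 [menu, shortcut], memory)
```

In this sketch the first call tries the higher-ranked menu skill, records its failure, and falls back to the shortcut skill; the second call skips the menu skill because memory already holds that failed attempt. A real system would presumably replace the word-overlap ranking with a learned retriever and the stubs with actual GUI actions.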