CUA-Skill: Develop Skills for Computer Using Agent

January 28, 2026
Authors: Tianyi Chen, Yinheng Li, Michael Solodko, Sen Wang, Nan Jiang, Tingyuan Cui, Junheng Hao, Jongwoo Ko, Sara Abdali, Suzhen Zheng, Leon Xu, Hao Fan, Pashmina Cameron, Justin Wagle, Kazuhito Koishida
cs.AI

Abstract

Computer-Using Agents (CUAs) aim to autonomously operate computer systems to complete real-world tasks. However, existing agentic systems remain difficult to scale and lag behind human performance. A key limitation is the absence of reusable, structured skill abstractions that capture how humans interact with graphical user interfaces and how they apply those skills. We introduce CUA-Skill, a computer-using agentic skill base that encodes human computer-use knowledge as skills coupled with parameterized execution and composition graphs. CUA-Skill is a large-scale library of carefully engineered skills spanning common Windows applications, serving as practical infrastructure and a tool substrate for scalable, reliable agent development. Built upon this skill base, we construct CUA-Skill Agent, an end-to-end computer-using agent that supports dynamic skill retrieval, argument instantiation, and memory-aware failure recovery. Our results demonstrate that CUA-Skill substantially improves execution success rates and robustness on challenging end-to-end agent benchmarks, establishing a strong foundation for future computer-using agent development. On WindowsAgentArena, CUA-Skill Agent achieves a state-of-the-art success rate of 57.5% (best of three) while being significantly more efficient than prior and concurrent approaches. The project page is available at https://microsoft.github.io/cua_skill/.
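
To make the three agent capabilities named in the abstract more concrete (skill retrieval, argument instantiation, and memory-aware failure recovery), here is a minimal Python sketch. It is an illustration under stated assumptions, not the CUA-Skill implementation: the `Skill` record, the word-overlap ranking, the `run` loop, and the two example skills are all hypothetical, and composition graphs are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    """Hypothetical skill record: a named, parameterized sequence of UI steps."""
    name: str
    description: str
    params: tuple[str, ...]
    steps: tuple[str, ...]  # step templates with {param} placeholders

    def instantiate(self, **args: str) -> list[str]:
        """Fill the step templates with concrete argument values."""
        missing = [p for p in self.params if p not in args]
        if missing:
            raise ValueError(f"missing arguments: {missing}")
        return [step.format(**args) for step in self.steps]


def retrieve(library, query, failed=frozenset()):
    """Rank skills by word overlap with the query (a stand-in for a real
    retriever), skipping skills that already failed on this task."""
    words = set(query.lower().split())
    candidates = [s for s in library if s.name not in failed]
    return sorted(
        candidates,
        key=lambda s: len(words & set(s.description.lower().split())),
        reverse=True,
    )


def run(library, query, args, execute):
    """Try the best-ranked skill; on failure, remember it and try the next.
    `execute` is a caller-supplied function returning True on success."""
    failed = set()  # the "memory" of skills that did not work
    while True:
        ranked = retrieve(library, query, failed)
        if not ranked:
            return None  # every candidate skill has failed
        skill = ranked[0]
        if execute(skill.instantiate(**args)):
            return skill.name
        failed.add(skill.name)


# Two made-up skills for the same goal, used only for illustration.
LIBRARY = [
    Skill("save_via_menu", "save the document with a file name", ("filename",),
          ("click File", "click Save As", "type {filename}", "press Enter")),
    Skill("save_via_hotkey", "save the document with a hotkey and file name",
          ("filename",), ("press Ctrl+S", "type {filename}", "press Enter")),
]
```

In this sketch, failure memory is just a set of skill names excluded from later retrieval, so a failed menu-based attempt falls back to the hotkey variant. A real agent would presumably rank skills with a learned retriever and record richer failure context.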