Agent Skills in the Wild: An Empirical Study of Security Vulnerabilities at Scale

January 15, 2026
Authors: Yi Liu, Weizhe Wang, Ruitao Feng, Yao Zhang, Guangquan Xu, Gelei Deng, Yuekang Li, Leo Zhang
cs.AI

Abstract

The rise of AI agent frameworks has introduced agent skills, modular packages containing instructions and executable code that dynamically extend agent capabilities. While this architecture enables powerful customization, skills execute with implicit trust and minimal vetting, creating a significant yet uncharacterized attack surface. We conduct the first large-scale empirical security analysis of this emerging ecosystem, collecting 42,447 skills from two major marketplaces and systematically analyzing 31,132 using SkillScan, a multi-stage detection framework integrating static analysis with LLM-based semantic classification. Our findings reveal pervasive security risks: 26.1% of skills contain at least one vulnerability, spanning 14 distinct patterns across four categories: prompt injection, data exfiltration, privilege escalation, and supply chain risks. Data exfiltration (13.3%) and privilege escalation (11.8%) are most prevalent, while 5.2% of skills exhibit high-severity patterns strongly suggesting malicious intent. We find that skills bundling executable scripts are 2.12x more likely to contain vulnerabilities than instruction-only skills (OR=2.12, p<0.001). Our contributions include: (1) a grounded vulnerability taxonomy derived from 8,126 vulnerable skills, (2) a validated detection methodology achieving 86.7% precision and 82.5% recall, and (3) an open dataset and detection toolkit to support future research. These results demonstrate an urgent need for capability-based permission systems and mandatory security vetting before this attack vector is further exploited.
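The headline figures in the abstract can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not part of SkillScan: the function names are hypothetical, the F1 score is derived here from the reported precision and recall rather than quoted from the paper, and the odds-ratio helper only shows what the OR statistic measures for a 2x2 table, since the abstract does not give the underlying counts.

```python
# Figures quoted in the abstract.
ANALYZED_SKILLS = 31_132
VULNERABLE_SKILLS = 8_126
PRECISION = 0.867
RECALL = 0.825


def prevalence(vulnerable: int, analyzed: int) -> float:
    """Share of analyzed skills with at least one vulnerability."""
    return vulnerable / analyzed


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (derived, not reported in the abstract)."""
    return 2 * precision * recall / (precision + recall)


def odds_ratio(exposed_pos: int, exposed_neg: int,
               control_pos: int, control_neg: int) -> float:
    """Odds ratio for a 2x2 table.

    Here "exposed" would mean skills bundling executable scripts and
    "control" instruction-only skills; "pos" means vulnerable.
    The abstract reports only the result (OR=2.12), not these counts.
    """
    return (exposed_pos * control_neg) / (exposed_neg * control_pos)


if __name__ == "__main__":
    share = prevalence(VULNERABLE_SKILLS, ANALYZED_SKILLS)
    print(f"Vulnerable share: {share:.1%}")
    print(f"Derived F1: {f1_score(PRECISION, RECALL):.3f}")
```

The first check confirms that 8,126 vulnerable skills out of 31,132 analyzed matches the reported 26.1%. Note that an odds ratio compares odds rather than probabilities, so OR=2.12 is not strictly the same as "2.12 times the probability" unless the baseline rate is small.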