SkCC: Portable and Secure Skill Compilation for Cross-Framework LLM Agents
May 5, 2026
Authors: Yipeng Ouyang, Yi Xiao, Yuhao Gu, Xianwei Zhang
cs.AI
Abstract
LLM agents have evolved into autonomous systems for complex task execution, with the SKILL.md specification emerging as a de facto standard for encapsulating agent capabilities. However, a critical bottleneck remains: different agent frameworks exhibit starkly different sensitivities to prompt formatting, causing up to 40% performance variation, yet nearly all skills exist as a single, format-agnostic Markdown version. Manual per-platform rewriting creates an unsustainable maintenance burden, and prior audits have found that over one-third of community skills contain security vulnerabilities. To address this, we present SkCC, a compilation framework that brings classical compiler design to agent skill development. At its core, SkIR, a strongly-typed intermediate representation, decouples skill semantics from platform-specific formatting, enabling portable deployment across heterogeneous agent frameworks. Around this IR, a compile-time Analyzer enforces security constraints via Anti-Skill Injection checks before deployment. Through a four-phase pipeline, SkCC reduces adaptation complexity from O(m × n) to O(m + n). Experiments on SkillsBench demonstrate that compiled skills consistently outperform their original counterparts, improving pass rates from 21.1% to 33.3% on Claude Code and from 35.1% to 48.7% on Kimi CLI, while achieving sub-10 ms compilation latency, a 94.8% proactive security trigger rate, and 10-46% runtime token savings across platforms.
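The O(m × n) → O(m + n) claim follows the classic compiler argument: with a shared IR, m source formats and n target platforms each need only one adapter to or from the IR, rather than m × n pairwise converters. A minimal sketch of that structure, with all names hypothetical (this is not SkCC's or SkIR's actual API):

```python
# Illustrative sketch of IR-based skill compilation: one parser per source
# format (m front-ends) and one emitter per platform (n back-ends) replace
# m x n direct converters. All identifiers here are hypothetical.
from dataclasses import dataclass


@dataclass
class SkillIR:
    """Minimal stand-in for a strongly-typed skill IR."""
    name: str
    steps: list[str]


# Front-end: one Markdown skill format -> IR.
def parse_markdown_skill(text: str) -> SkillIR:
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    return SkillIR(name=lines[0].lstrip("# "), steps=lines[1:])


# Back-end 1: IR -> numbered-list prompt format.
def emit_numbered(ir: SkillIR) -> str:
    body = "\n".join(f"{i}. {s}" for i, s in enumerate(ir.steps, 1))
    return f"Skill: {ir.name}\n{body}"


# Back-end 2: IR -> XML-tagged prompt format.
def emit_xml_tags(ir: SkillIR) -> str:
    body = "\n".join(f"  <step>{s}</step>" for s in ir.steps)
    return f'<skill name="{ir.name}">\n{body}\n</skill>'


ir = parse_markdown_skill("# summarize\nRead the file\nWrite a summary")
print(emit_numbered(ir))
print(emit_xml_tags(ir))
```

Adding a new platform here means writing one more emitter over the IR, not rewriting every existing skill, which is the source of the additive complexity.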