

From Skill Text to Skill Structure: The Scheduling-Structural-Logical Representation for Agent Skills

April 27, 2026
Authors: Qiliang Liang, Hansi Wang, Zhong Liang, Yang Liu
cs.AI

Abstract
LLM agents increasingly rely on reusable skills: capability packages that combine instructions, control flow, constraints, and tool calls. In most current agent systems, however, skills are still represented by text-heavy artifacts, including SKILL.md-style documents and structured records whose machine-usable evidence remains embedded largely in natural-language descriptions. This poses a challenge for skill-centered agent systems: both managing skill collections and using skills to support agents require reasoning over invocation interfaces, execution structure, and concrete side effects that are often entangled in a single textual surface. An explicit representation of skill knowledge may therefore make these artifacts easier for machines to acquire and leverage. Drawing on Memory Organization Packets, Script Theory, and Conceptual Dependency from Schank and Abelson's classical work on linguistic knowledge representation, we introduce what is, to our knowledge, the first structured representation for agent skill artifacts that disentangles skill-level scheduling signals, scene-level execution structure, and logic-level action and resource-use evidence: the Scheduling-Structural-Logical (SSL) representation. We instantiate SSL with an LLM-based normalizer and evaluate it on a corpus of skills in two tasks, Skill Discovery and Risk Assessment, where it substantially outperforms text-only baselines: in Skill Discovery, SSL improves MRR from 0.573 to 0.707; in Risk Assessment, it improves macro F1 from 0.744 to 0.787. These findings show that explicit, source-grounded structure makes agent skills easier to search and review. They also suggest that SSL is best understood as a practical step toward more inspectable, reusable, and operationally actionable skill representations for agent systems, rather than as a finished standard or an end-to-end mechanism for managing and using skills.
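To make the three-level decomposition concrete, the following is a purely illustrative Python sketch of what an SSL-style skill record could look like. All class and field names here are assumptions for illustration; the paper's actual schema is not reproduced in this abstract.

```python
from dataclasses import dataclass

# Hypothetical sketch of a three-level skill record in the spirit of the
# SSL (Scheduling-Structural-Logical) representation. Field names are
# illustrative assumptions, not the paper's actual schema.

@dataclass
class SchedulingLevel:
    """Skill-level scheduling signals: when and how the skill is invoked."""
    name: str
    trigger_description: str      # natural-language invocation cue
    input_parameters: list

@dataclass
class StructuralLevel:
    """Scene-level execution structure, simplified to a linear step list."""
    steps: list

@dataclass
class LogicalLevel:
    """Logic-level action and resource-use evidence (concrete side effects)."""
    tool_calls: list
    resources_touched: list       # e.g. files, network endpoints

@dataclass
class SSLSkill:
    scheduling: SchedulingLevel
    structure: StructuralLevel
    logic: LogicalLevel

    def risk_signals(self):
        """Surface concrete side effects for review, the kind of evidence
        the Risk Assessment task in the abstract relies on."""
        return self.logic.resources_touched

skill = SSLSkill(
    scheduling=SchedulingLevel(
        name="export_report",
        trigger_description="User asks to export analysis results",
        input_parameters=["format", "destination"],
    ),
    structure=StructuralLevel(steps=["render report", "write file"]),
    logic=LogicalLevel(
        tool_calls=["pdf_renderer", "fs.write"],
        resources_touched=["filesystem:/reports"],
    ),
)
print(skill.risk_signals())  # → ['filesystem:/reports']
```

The point of the sketch is the separation of concerns the abstract describes: a skill index can rank candidates using only the scheduling level, while a reviewer can audit side effects from the logical level without parsing free-form documentation.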
PDF · May 5, 2026