From Skill Text to Skill Structure: The Scheduling-Structural-Logical Representation for Agent Skills

Abstract

LLM agents increasingly rely on reusable skills: capability packages that combine instructions, control flow, constraints, and tool calls. In most current agent systems, however, skills are still represented by text-heavy artifacts, including SKILL.md-style documents and structured records whose machine-usable evidence remains largely embedded in natural-language descriptions. This poses a challenge for skill-centered agent systems: both managing skill collections and using skills to support agents require reasoning over invocation interfaces, execution structure, and concrete side effects, which are often entangled in a single textual surface. An explicit representation of skill knowledge may therefore make these artifacts easier for machines to acquire and leverage. Drawing on Memory Organization Packets, Script Theory, and Conceptual Dependency from Schank and Abelson's classical work on linguistic knowledge representation, we introduce what is, to our knowledge, the first structured representation for agent skill artifacts that disentangles skill-level scheduling signals, scene-level execution structure, and logic-level action and resource-use evidence: the Scheduling-Structural-Logical (SSL) representation. We instantiate SSL with an LLM-based normalizer and evaluate it on a corpus of skills in two tasks, Skill Discovery and Risk Assessment, where it substantially outperforms text-only baselines: in Skill Discovery, SSL improves MRR from 0.573 to 0.707; in Risk Assessment, it improves macro F1 from 0.744 to 0.787. These findings indicate that explicit, source-grounded structure makes agent skills easier to search and review. They also suggest that SSL is best understood as a practical step toward more inspectable, reusable, and operationally actionable skill representations for agent systems, rather than as a finished standard or an end-to-end mechanism for managing and using skills.
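
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how MRR and macro F1 are conventionally computed. It uses the standard textbook definitions only; the abstract does not describe the evaluation protocol, so the function names, the toy queries, and the risk labels below are all hypothetical and unrelated to the reported numbers. Averaging F1 over the union of gold and predicted labels is likewise an assumption of this sketch.

```python
def mean_reciprocal_rank(ranked_lists, relevant):
    """MRR: mean over queries of 1/rank of the first relevant item (0 if none is found)."""
    total = 0.0
    for query_id, ranking in ranked_lists.items():
        for rank, skill_id in enumerate(ranking, start=1):
            if skill_id in relevant[query_id]:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(ranked_lists)


def macro_f1(gold, predicted):
    """Macro F1: unweighted mean of per-class F1 scores."""
    # Assumption: classes are the union of gold and predicted labels.
    labels = sorted(set(gold) | set(predicted))
    pairs = list(zip(gold, predicted))
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in pairs)
        fp = sum(g != label and p == label for g, p in pairs)
        fn = sum(g == label and p != label for g, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: three retrieval queries, four risk labels.
rankings = {"q1": ["a", "b", "c"], "q2": ["b", "a", "c"], "q3": ["c", "b", "a"]}
relevant = {"q1": {"a"}, "q2": {"a"}, "q3": {"a"}}
gold = ["low", "low", "high", "high"]
pred = ["low", "high", "high", "high"]

mrr = mean_reciprocal_rank(rankings, relevant)  # (1 + 1/2 + 1/3) / 3
f1 = macro_f1(gold, pred)                       # mean of F1(low) = 2/3 and F1(high) = 4/5
```

Because macro F1 weights every class equally, a rare risk class affects the score as much as a common one, which is a common reason to prefer it for imbalanced label sets.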

