LatentSkill: From In-Context Textual Skills to In-Weight Latent Skills for LLM Agents

Abstract

Agent systems increasingly use textual skills to encode reusable task procedures, but injecting these skills into the prompt at every step incurs substantial context overhead and exposes the skill content as plaintext. We present LatentSkill, a framework that converts textual skills into plug-and-play LoRA adapters through a pretrained hypernetwork. LatentSkill stores skill knowledge in weight space rather than context space, which removes per-step skill tokens while preserving modular loading, scaling, and composition. On ALFWorld and Search-QA, LatentSkill outperforms the corresponding in-context skill baseline while using substantially fewer prefill tokens: it improves ALFWorld success rate by 21.4 and 13.4 percentage points on the seen and unseen splits, respectively, with 64.1% fewer prefill tokens, and it improves Search-QA exact match by 3.0 percentage points with 72.2% lower skill-token overhead. Further analysis shows that the generated skill LoRAs form a structured semantic geometry, can be precisely controlled via the LoRA scaling coefficient, and can be composed through parameter-space arithmetic when skill components are aligned. These findings suggest that weight-space skills provide an efficient, modular, and less exposed substrate for extending LLM agents.
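To make the scaling and composition claims concrete, the following is a minimal sketch of how a skill adapter could be scaled and how two adapters could be combined by parameter-space arithmetic. It is not the authors' implementation: it assumes the standard LoRA form W' = W + alpha * (B A), it uses hand-written toy matrices in place of hypernetwork outputs, and it does not model the alignment condition under which composition is reported to work. The names `lora_delta` and `apply_skills` are hypothetical.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]
# A skill adapter is assumed to be a (B, A, alpha) triple: two low-rank
# factors plus a scaling coefficient. This layout is an illustration only.
Skill = Tuple[Matrix, Matrix, float]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python product of a (m x k) and b (k x n)."""
    return [
        [sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def lora_delta(B: Matrix, A: Matrix, alpha: float) -> Matrix:
    """Weight update contributed by one adapter: alpha * (B @ A)."""
    return [[alpha * v for v in row] for row in matmul(B, A)]


def apply_skills(W: Matrix, skills: Sequence[Skill]) -> Matrix:
    """Return W + sum_i alpha_i * (B_i @ A_i) without modifying W."""
    out = [row[:] for row in W]
    for B, A, alpha in skills:
        delta = lora_delta(B, A, alpha)
        out = [[o + d for o, d in zip(orow, drow)] for orow, drow in zip(out, delta)]
    return out


# Toy 2x2 base weight and two rank-1 adapters (made-up values).
W = [[1.0, 0.0], [0.0, 1.0]]
B_a, A_a = [[1.0], [0.0]], [[0.0, 2.0]]  # B_a @ A_a = [[0, 2], [0, 0]]
B_b, A_b = [[0.0], [1.0]], [[3.0, 0.0]]  # B_b @ A_b = [[0, 0], [3, 0]]

# Scaling: load skill A at half strength.
W_half = apply_skills(W, [(B_a, A_a, 0.5)])

# Composition: add both skill updates to the same base weight.
W_both = apply_skills(W, [(B_a, A_a, 1.0), (B_b, A_b, 1.0)])
```

Because each skill only contributes an additive update, unloading a skill amounts to dropping its triple from the list, and the prompt itself carries no skill tokens in this picture.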