SkillAdaptor: Trajectory-Based Adaptive Skills for LLM Agents

Abstract

Large language model (LLM) agents increasingly rely on reusable external skills to solve long-horizon interactive tasks. Existing training-free skill adaptation pipelines usually update skills from full trajectories or session-level feedback, which makes failure attribution coarse and often produces unstable or overly broad revisions. We propose SkillAdaptor, a training-free, step-level skill adaptation framework with explicit failure attribution that plugs into OpenClaw-class agent harnesses. Given a failed trajectory, SkillAdaptor identifies the first actionable fault step, links responsibility to candidate skills, and applies targeted updates under explicit acceptance checks while keeping the backbone frozen. We evaluate on WebShop, PinchBench, and Claw-Eval with Kimi-K2.5, GLM-5, and GPT-5.2. SkillAdaptor improves over no-skill and skill-adaptation baselines on all three suites, with the largest single-metric improvements being +1.5 points on PinchBench Avg Score%, +1.8 on Claw-Eval Avg Score, and +1.7 on WebShop success rate. These results indicate that step-level attribution supports more stable and auditable training-free skill maintenance. The code will be released at https://github.com/zjunlp/SkillAdaptor.
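
To make the step-level loop described in the abstract concrete, the following is a minimal Python sketch of it. It is not the released SkillAdaptor implementation: every name here (`Step`, `first_actionable_fault`, `adapt_skills`, `revise`, `accept`) is hypothetical, a per-step `ok` flag is assumed to stand in for whatever fault signal the framework actually uses, and "actionable" is simplified to mean "a failed step that invoked a skill present in the library".

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Step:
    index: int
    skill: Optional[str]  # skill invoked at this step, if any (hypothetical field)
    ok: bool              # assumed per-step success signal


def first_actionable_fault(trajectory: List[Step], skills: Dict[str, str]) -> Optional[Step]:
    """Return the earliest failed step that can be tied to a known skill, else None."""
    for step in trajectory:
        if not step.ok and step.skill in skills:
            return step
    return None


def adapt_skills(
    skills: Dict[str, str],
    trajectory: List[Step],
    revise: Callable[[str, Step], str],
    accept: Callable[[str, str], bool],
) -> Dict[str, str]:
    """Revise only the skill attributed to the first fault; keep it only if accepted."""
    fault = first_actionable_fault(trajectory, skills)
    if fault is None:
        return skills  # nothing attributable, leave the library untouched
    candidate = revise(skills[fault.skill], fault)
    if not accept(fault.skill, candidate):
        return skills  # acceptance check failed, discard the revision
    updated = dict(skills)  # copy so the original library stays auditable
    updated[fault.skill] = candidate
    return updated
```

In this sketch `revise` would typically wrap a call to the frozen backbone model and `accept` would rerun a validation episode; both are left as caller-supplied functions because the abstract does not specify them. Only one skill changes per failed trajectory, which is the property that distinguishes step-level attribution from trajectory-level rewriting.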