SkillClaw: Letting Skills Evolve Collectively with an Agentic Evolver

Abstract

Large language model (LLM) agents such as OpenClaw rely on reusable skills to perform complex tasks, yet these skills remain largely static after deployment. As a result, similar workflows, tool-usage patterns, and failure modes are rediscovered repeatedly across users, and the system does not improve with experience. Although interactions from different users provide complementary signals about when a skill works or fails, existing systems lack a mechanism for converting such heterogeneous experience into reliable skill updates. To address this, we present SkillClaw, a framework for collective skill evolution in multi-user agent ecosystems that treats interactions across users and over time as the primary signal for improving skills. SkillClaw continuously aggregates the trajectories generated during use and processes them with an autonomous evolver, which identifies recurring behavioral patterns and translates them into updates to the skill set by refining existing skills or extending them with new capabilities. The resulting skills are maintained in a shared repository and synchronized across users, so improvements discovered in one context propagate system-wide with no additional effort from users. By integrating multi-user experience into ongoing skill updates, SkillClaw enables cross-user knowledge transfer and cumulative capability improvement. Experiments on WildClawBench show that, even with limited interaction and feedback, it significantly improves the performance of Qwen3-Max in real-world agent scenarios.
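
The aggregate, evolve, and synchronize loop described above can be illustrated with a small sketch. This is not SkillClaw's implementation: the abstract does not specify how the evolver works, so the sketch swaps in a simple counting heuristic (a failure pattern counts as recurring once it appears for at least `min_users` distinct users). All names here (`Trajectory`, `Skill`, `evolve`, `failure_note`, `min_users`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Trajectory:
    """One logged interaction (hypothetical schema, not from the paper)."""
    user: str
    skill: str               # name of the skill the agent invoked
    success: bool
    failure_note: str = ""   # short label for what went wrong, if anything


@dataclass
class Skill:
    """An entry in the shared skill repository (hypothetical schema)."""
    name: str
    guidance: list[str] = field(default_factory=list)
    version: int = 1


def evolve(repo: dict[str, Skill], trajectories: list[Trajectory],
           min_users: int = 2) -> list[str]:
    """Refine skills whose failures recur across at least `min_users` users.

    Assumption: a counting heuristic stands in for the autonomous evolver.
    Returns the names of the skills that were updated.
    """
    # Aggregate: which distinct users hit each (skill, failure) pattern?
    users_by_pattern: dict[tuple[str, str], set[str]] = defaultdict(set)
    for t in trajectories:
        if not t.success and t.failure_note:
            users_by_pattern[(t.skill, t.failure_note)].add(t.user)

    updated: list[str] = []
    for (skill_name, note), users in sorted(users_by_pattern.items()):
        if len(users) < min_users:
            continue  # a failure seen by one user is treated as noise here
        # Refine an existing skill, or extend the repository with a new one.
        skill = repo.setdefault(skill_name, Skill(skill_name))
        hint = f"avoid: {note}"
        if hint not in skill.guidance:
            skill.guidance.append(hint)
            skill.version += 1
            updated.append(skill_name)
    return updated


# Because `repo` is shared, every user sees the refined skill afterwards.
repo = {"fetch_page": Skill("fetch_page")}
logs = [
    Trajectory("alice", "fetch_page", False, "missing timeout"),
    Trajectory("bob", "fetch_page", False, "missing timeout"),
    Trajectory("carol", "fetch_page", False, "bad proxy"),
    Trajectory("alice", "fetch_page", True),
]
updated = evolve(repo, logs)
```

In this toy run, only the failure shared by two users triggers an update; the failure seen by a single user is ignored. A real evolver would presumably reason over full trajectories rather than short failure labels.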