MUSE-Autoskill: A Self-Evolving Agent Based on Skill Creation, Memory, Management, and Evaluation

Abstract

Large language model (LLM) agents rely on reusable skills to solve complex tasks. However, existing skill creation approaches treat skills as isolated and static artifacts, limiting their reusability, reliability, and long-term improvement. We propose MUSE-Autoskill Agent (Memory-Utilizing Skill Evolution), a skill-centric agent framework that lets agents continuously improve their task-solving capability by creating, reusing, and refining skills under a unified lifecycle (creation, memory, management, evaluation, and refinement). Our framework enables agents to create skills on demand, store and reuse them across tasks, organize and select them efficiently, and evaluate them through unit tests and runtime feedback for continuous refinement. We further introduce skill-level memory that accumulates experience for each skill across tasks, enabling more effective reuse and adaptation over time. Experiments on SkillsBench provide initial evidence that lifecycle-managed skills can improve task success, efficiency, reuse, and cross-agent transfer, highlighting the importance of treating skills as long-lived, experience-aware, and testable assets.
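The lifecycle named above (creation, memory, management, evaluation, refinement) can be illustrated with a minimal sketch. This is not the MUSE-Autoskill implementation: the `Skill` and `SkillLibrary` classes, their fields, and the keyword-overlap selection rule are all hypothetical choices made here for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Skill:
    """A reusable skill with its tests and experience log (hypothetical schema)."""
    name: str
    description: str
    func: Callable[..., object]
    tests: List[Tuple[tuple, object]] = field(default_factory=list)  # (args, expected)
    memory: List[dict] = field(default_factory=list)  # skill-level experience records


class SkillLibrary:
    """Toy skill store; every method name here is an assumption, not the paper's API."""

    def __init__(self) -> None:
        self.skills: Dict[str, Skill] = {}

    @staticmethod
    def evaluate(skill: Skill) -> bool:
        """Evaluation: run every (args, expected) unit test."""
        for args, expected in skill.tests:
            try:
                if skill.func(*args) != expected:
                    return False
            except Exception:
                return False
        return True

    def create(self, skill: Skill) -> bool:
        """Creation: admit a skill only if it passes its own unit tests."""
        if not self.evaluate(skill):
            return False
        self.skills[skill.name] = skill
        return True

    def select(self, task: str) -> Optional[Skill]:
        """Management: pick the skill whose description shares the most words with the task."""
        words = set(task.lower().split())
        best, best_score = None, 0
        for skill in self.skills.values():
            score = len(words & set(skill.description.lower().split()))
            if score > best_score:
                best, best_score = skill, score
        return best

    def run(self, skill: Skill, task: str, *args):
        """Memory: execute the skill and record the outcome as runtime feedback."""
        try:
            result = skill.func(*args)
            skill.memory.append({"task": task, "ok": True})
            return result
        except Exception as exc:
            skill.memory.append({"task": task, "ok": False, "error": repr(exc)})
            return None

    def refine(self, name: str, new_func: Callable[..., object],
               new_tests: List[Tuple[tuple, object]]) -> bool:
        """Refinement: swap in a new implementation if it passes old and new tests."""
        old = self.skills[name]
        candidate = Skill(name, old.description, new_func,
                          old.tests + list(new_tests), old.memory)
        if not self.evaluate(candidate):
            return False
        self.skills[name] = candidate
        return True


# Usage: create a skill, reuse it on a task, observe a failure, then refine it.
lib = SkillLibrary()
mean = Skill("mean", "compute the mean of a list of numbers",
             lambda xs: sum(xs) / len(xs), tests=[(([1, 2, 3],), 2.0)])
lib.create(mean)
chosen = lib.select("compute mean of numbers")
lib.run(chosen, "task-1", [])  # raises ZeroDivisionError internally; logged as a failure
lib.refine("mean", lambda xs: sum(xs) / len(xs) if xs else 0.0, [(([],), 0.0)])
```

In this sketch the failure recorded at runtime motivates a new unit test, and the refined skill keeps the same experience log, which is one simple way to read "experience-aware and testable" skills.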