SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning
February 9, 2026
Authors: Peng Xia, Jianwen Chen, Hanyang Wang, Jiaqi Liu, Kaide Zeng, Yu Wang, Siwei Han, Yiyang Zhou, Xujiang Zhao, Haifeng Chen, Zeyu Zheng, Cihang Xie, Huaxiu Yao
cs.AI
Abstract
Large Language Model (LLM) agents have shown impressive capabilities on complex tasks, yet they often operate in isolation and fail to learn from past experience. Existing memory-based methods primarily store raw trajectories, which are often redundant and noisy, preventing agents from extracting the high-level, reusable behavioral patterns essential for generalization. In this paper, we propose SkillRL, a framework that bridges the gap between raw experience and policy improvement through automatic skill discovery and recursive evolution. Our approach introduces an experience-based distillation mechanism that builds a hierarchical skill library, SkillBank; an adaptive retrieval strategy for general and task-specific heuristics; and a recursive evolution mechanism that allows the skill library to co-evolve with the agent's policy during reinforcement learning. These innovations significantly reduce the token footprint while enhancing reasoning utility. Experimental results on ALFWorld, WebShop, and seven search-augmented tasks demonstrate that SkillRL achieves state-of-the-art performance, outperforming strong baselines by over 15.3% and maintaining robustness as task complexity increases. Code is available at https://github.com/aiming-lab/SkillRL.
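The abstract names three mechanisms: experience-based distillation into SkillBank, adaptive retrieval of general and task-specific heuristics, and recursive co-evolution of the skill library with the RL policy. The Python sketch below shows one way these pieces could fit together in a training loop; every identifier in it (Skill, SkillBank, Trajectory, distill_skills, train, and the env/policy interface) is a hypothetical illustration inferred from the abstract alone, not the API of the released implementation.

```python
# A minimal, hypothetical sketch of the training loop described in the abstract.
# All names here (Skill, SkillBank, Trajectory, distill_skills, train) are
# illustrative assumptions, NOT the API of the released SkillRL codebase.

from dataclasses import dataclass, field

@dataclass
class Trajectory:
    task: str
    summary: str       # condensed description of what the agent did
    success: bool

@dataclass
class Skill:
    name: str
    heuristic: str     # distilled, reusable behavioral pattern
    task_specific: bool

@dataclass
class SkillBank:
    skills: list[Skill] = field(default_factory=list)

    def retrieve(self, task: str, k: int = 4) -> list[Skill]:
        # Adaptive retrieval: prefer task-matched heuristics, back-fill with
        # general ones, and cap at k to keep the token footprint small.
        specific = [s for s in self.skills if s.task_specific and s.name in task]
        general = [s for s in self.skills if not s.task_specific]
        return (specific + general)[:k]

    def update(self, new_skills: list[Skill]) -> None:
        # Recursive evolution: fold newly distilled skills back into the bank
        # so it co-evolves with the policy across training iterations.
        self.skills.extend(new_skills)

def distill_skills(trajectories: list[Trajectory]) -> list[Skill]:
    # Placeholder for the paper's experience-based distillation (LLM-driven);
    # here we simply promote summaries of successful trajectories to skills.
    return [Skill(name=t.task, heuristic=t.summary, task_specific=True)
            for t in trajectories if t.success]

def train(env, policy, bank: SkillBank, iterations: int) -> None:
    for _ in range(iterations):
        # 1. Condition rollouts on retrieved skills rather than raw trajectories.
        skills = bank.retrieve(env.task_description)
        trajectories = policy.rollout(env, context=skills)
        # 2. Standard RL policy improvement on the collected trajectories.
        policy.rl_update(trajectories)
        # 3. Distill new heuristics and recursively evolve the skill bank.
        bank.update(distill_skills(trajectories))
```

The key design point the sketch tries to capture is the feedback loop of step 3: the bank that conditions the next round of rollouts is itself a product of earlier rollouts, which is what the paper appears to mean by recursive evolution.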