Skills-Coach: A Self-Evolving Skill Optimizer via Training-Free GRPO

April 30, 2026
Authors: Yu Tian, Jiawei Chen, Lifan Zheng, Mingxiang Tao, Xinyi Zeng, Zhaoxia Yin, Hang Su, Xian Sun
cs.AI

Abstract

We introduce Skills-Coach, a novel automated framework designed to significantly enhance the self-evolution of skills within Large Language Model (LLM)-based agents. Addressing the current fragmentation of the skill ecosystem, Skills-Coach explores the boundaries of skill capabilities, thereby facilitating the comprehensive competency coverage essential for intelligent applications. The framework comprises four core modules: a Diverse Task Generation Module that systematically creates a comprehensive test suite for various skills; a Lightweight Optimization Module dedicated to optimizing skill prompts and their corresponding code; a Comparative Execution Module facilitating the execution and evaluation of both original and optimized skills; and a Traceable Evaluation Module, which rigorously evaluates performance against specified criteria. Skills-Coach offers flexible execution options through its virtual and real modes. To validate its efficacy, we introduce Skill-X, a comprehensive benchmark dataset consisting of 48 diverse skills. Experimental results demonstrate that Skills-Coach achieves significant performance improvements in skill capability across a wide range of categories, highlighting its potential to advance the development of more robust and adaptable LLM-based agents.
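The four-module loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no API, so every name here (`Skill`, `generate_tasks`, `propose_variants`, `evaluate`, `group_relative_advantages`, `coach`) is hypothetical, the task generator and optimizer are stubs standing in for LLM calls, and "training-free GRPO" is interpreted, as an assumption, to mean scoring a group of candidate skills relative to the group mean and selecting among them without any gradient update.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Callable, List, Tuple


@dataclass
class Skill:
    """Hypothetical skill record: a name plus its prompt text."""
    name: str
    prompt: str


def generate_tasks(skill: Skill, n: int) -> List[str]:
    """Task generation stub; a real system would likely query an LLM."""
    return [f"{skill.name} task {i}" for i in range(n)]


def propose_variants(skill: Skill, hints: List[str]) -> List[Skill]:
    """Optimization stub: one candidate prompt rewrite per hint."""
    return [Skill(skill.name, f"{skill.prompt} {hint}") for hint in hints]


def evaluate(skill: Skill, tasks: List[str],
             judge: Callable[[Skill, str], float]) -> float:
    """Evaluation stub: mean judge score of a skill over all tasks."""
    return mean(judge(skill, task) for task in tasks)


def group_relative_advantages(scores: List[float]) -> List[float]:
    """Normalize each score against the group mean and std (GRPO-style)."""
    mu, sigma = mean(scores), pstdev(scores)
    if sigma == 0:
        return [0.0] * len(scores)  # no candidate stands out
    return [(s - mu) / sigma for s in scores]


def coach(skill: Skill, hints: List[str],
          judge: Callable[[Skill, str], float],
          n_tasks: int = 4) -> Tuple[Skill, List[float]]:
    """Compare the original skill with its variants and keep the best one."""
    tasks = generate_tasks(skill, n_tasks)
    group = [skill] + propose_variants(skill, hints)  # original is index 0
    scores = [evaluate(candidate, tasks, judge) for candidate in group]
    advantages = group_relative_advantages(scores)
    # max() returns the first maximum, so ties keep the original skill.
    best = max(range(len(group)), key=lambda i: advantages[i])
    return group[best], advantages


def toy_judge(skill: Skill, task: str) -> float:
    """Toy scoring rule for illustration only, not a real evaluator."""
    return 1.0 if "step by step" in skill.prompt else 0.0
```

Calling `coach(Skill("summarize", "Summarize the text."), ["Think step by step.", "Be brief."], toy_judge)` returns the variant favored by the toy judge together with the per-candidate advantages. No model weights change anywhere in the loop; only the skill prompt is replaced, which is the sense of "training-free" assumed in this sketch.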