SkillOrchestra: Learning to Route Agents via Skill Transfer

Abstract

Compound AI systems promise capabilities beyond those of individual models, yet their success depends critically on effective orchestration. Existing routing approaches face two limitations: (1) input-level routers make coarse query-level decisions that ignore evolving task requirements; (2) RL-trained orchestrators are expensive to adapt and often suffer from routing collapse, repeatedly invoking one strong but costly option in multi-turn scenarios. We introduce SkillOrchestra, a framework for skill-aware orchestration. Instead of directly learning a routing policy end-to-end, SkillOrchestra learns fine-grained skills from execution experience and models agent-specific competence and cost under those skills. At deployment, the orchestrator infers the skill demands of the current interaction and selects the agents that best satisfy them under an explicit performance-cost trade-off. Extensive experiments across ten benchmarks demonstrate that SkillOrchestra outperforms state-of-the-art RL-based orchestrators by up to 22.5%, with 700x and 300x lower learning cost than Router-R1 and ToolOrchestra, respectively. These results show that explicit skill modeling enables scalable, interpretable, and sample-efficient orchestration, offering a principled alternative to data-intensive RL-based approaches. The code is available at: https://github.com/jiayuww/SkillOrchestra.
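The abstract describes selecting an agent from per-skill competence and cost under an explicit performance-cost trade-off. The sketch below illustrates that idea in its simplest form and is not the SkillOrchestra implementation: the `AgentProfile` class, the `select_agent` function, the linear utility (demand-weighted competence minus `cost_weight` times cost), and all numbers are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentProfile:
    """Hypothetical profile; field names are illustrative, not from the paper."""
    name: str
    competence: dict[str, float]  # assumed per-skill success estimates in [0, 1]
    cost: float                   # assumed cost per call, arbitrary units


def select_agent(profiles, skill_demand, cost_weight):
    """Pick the profile with the highest demand-weighted competence minus weighted cost.

    skill_demand maps skill name -> non-negative weight for the current turn.
    cost_weight controls how strongly cost is penalized (assumed linear trade-off).
    """
    total = sum(skill_demand.values())
    if total <= 0:
        raise ValueError("skill_demand must contain at least one positive weight")

    def utility(profile):
        # Unknown skills count as zero competence for that agent.
        fit = sum(
            weight * profile.competence.get(skill, 0.0)
            for skill, weight in skill_demand.items()
        ) / total
        return fit - cost_weight * profile.cost

    return max(profiles, key=utility)


# Made-up example data: one strong, costly agent and one cheap, weaker agent.
profiles = [
    AgentProfile("large-model", {"math": 0.9, "search": 0.8}, cost=1.0),
    AgentProfile("small-model", {"math": 0.5, "search": 0.75}, cost=0.1),
]

# A search-heavy turn favors the cheap agent; a math-heavy turn favors the strong one.
print(select_agent(profiles, {"search": 1.0}, cost_weight=0.2).name)
print(select_agent(profiles, {"math": 1.0}, cost_weight=0.2).name)
```

Because the skill demand is re-estimated at each turn in this sketch, the choice can change over the course of an interaction, which is the behavior the abstract contrasts with routing collapse onto one strong but costly option.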
