SkillOrchestra: Learning to Route Agents via Skill Transfer

Abstract

Compound AI systems promise capabilities beyond those of individual models, yet their success depends critically on effective orchestration. Existing routing approaches face two limitations: (1) input-level routers make coarse query-level decisions that ignore evolving task requirements; (2) RL-trained orchestrators are expensive to adapt and often suffer from routing collapse, repeatedly invoking one strong but costly option in multi-turn scenarios. We introduce SkillOrchestra, a framework for skill-aware orchestration. Instead of directly learning a routing policy end-to-end, SkillOrchestra learns fine-grained skills from execution experience and models agent-specific competence and cost under those skills. At deployment, the orchestrator infers the skill demands of the current interaction and selects the agents that best satisfy them under an explicit performance-cost trade-off. Extensive experiments across ten benchmarks demonstrate that SkillOrchestra outperforms state-of-the-art RL-based orchestrators by up to 22.5% while reducing learning cost by 700x and 300x compared to Router-R1 and ToolOrchestra, respectively. These results show that explicit skill modeling enables scalable, interpretable, and sample-efficient orchestration, offering a principled alternative to data-intensive RL-based approaches. The code is available at: https://github.com/jiayuww/SkillOrchestra.
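The selection step described in the abstract (infer the skills a turn requires, then pick an agent under a performance-cost trade-off) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify a scoring rule, so the `AgentProfile` record, the `select_agent` function, the linear "competence minus weighted cost" score, the `cost_weight` parameter, and all numbers below are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AgentProfile:
    """Hypothetical per-agent record (not from the paper)."""
    name: str
    competence: Dict[str, float]  # skill -> estimated success rate in [0, 1]
    cost: float                   # relative cost of one invocation


def select_agent(required_skills: List[str],
                 agents: List[AgentProfile],
                 cost_weight: float = 0.5) -> AgentProfile:
    """Pick the agent with the highest assumed score: skill fit minus weighted cost."""
    if not required_skills:
        raise ValueError("required_skills must not be empty")

    def score(agent: AgentProfile) -> float:
        # Average competence over the skills the current turn needs;
        # a skill with no estimate for this agent counts as 0.
        fit = sum(agent.competence.get(s, 0.0) for s in required_skills)
        fit /= len(required_skills)
        return fit - cost_weight * agent.cost

    return max(agents, key=score)


# Illustrative, made-up profiles.
agents = [
    AgentProfile("large-model", {"math": 0.9, "retrieval": 0.9}, cost=1.0),
    AgentProfile("small-model", {"math": 0.5, "retrieval": 0.85}, cost=0.1),
]

print(select_agent(["retrieval"], agents).name)
print(select_agent(["math"], agents, cost_weight=0.1).name)
```

In this toy setup, a turn that only needs retrieval goes to the cheaper agent because its competence is nearly as high, while a math-heavy turn with a low cost weight goes to the stronger agent. Scoring each turn by its own skill demands, rather than by the query as a whole, is one way an orchestrator could avoid always invoking the same costly option.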
