SkillOrchestra: Learning to Route Agents via Skill Transfer

Abstract

Compound AI systems promise capabilities beyond those of individual models, yet their success depends critically on effective orchestration. Existing routing approaches face two limitations: (1) input-level routers make coarse, query-level decisions that ignore evolving task requirements; (2) RL-trained orchestrators are expensive to adapt and often suffer from routing collapse, repeatedly invoking one strong but costly option in multi-turn scenarios. We introduce SkillOrchestra, a framework for skill-aware orchestration. Instead of learning a routing policy end-to-end, SkillOrchestra learns fine-grained skills from execution experience and models agent-specific competence and cost under those skills. At deployment, the orchestrator infers the skill demands of the current interaction and selects the agents that best satisfy them under an explicit performance-cost trade-off. Extensive experiments across ten benchmarks show that SkillOrchestra outperforms state-of-the-art RL-based orchestrators by up to 22.5% while reducing learning cost by 700x and 300x compared to Router-R1 and ToolOrchestra, respectively. These results show that explicit skill modeling enables scalable, interpretable, and sample-efficient orchestration, offering a principled alternative to data-intensive RL-based approaches. The code is available at https://github.com/jiayuww/SkillOrchestra.
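The deployment-time selection step described above (infer skill demands, then pick the agent that best satisfies them under a performance-cost trade-off) might be sketched roughly as follows. This is an illustration only, not the paper's implementation: the linear scoring rule, the names `AgentProfile`, `select_agent`, and `cost_weight`, and all numbers are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class AgentProfile:
    # Hypothetical per-agent profile; the field names are assumptions.
    name: str
    competence: dict[str, float]  # estimated success rate per skill, in [0, 1]
    cost: float                   # estimated cost per call, arbitrary units


def select_agent(agents, skill_demand, cost_weight=0.2):
    """Return the agent with the highest (expected competence - weighted cost).

    skill_demand maps a skill name to its weight for the current turn.
    The linear trade-off used here is an assumed scoring rule.
    """
    def score(agent):
        expected = sum(
            weight * agent.competence.get(skill, 0.0)
            for skill, weight in skill_demand.items()
        )
        return expected - cost_weight * agent.cost

    return max(agents, key=score)


# Made-up profiles for two agents.
agents = [
    AgentProfile("large", {"math": 0.9, "search": 0.8}, cost=1.0),
    AgentProfile("small", {"math": 0.5, "search": 0.75}, cost=0.1),
]

# A math-heavy turn justifies the costly agent; a search turn does not.
math_choice = select_agent(agents, {"math": 1.0})
search_choice = select_agent(agents, {"search": 1.0})
print(math_choice.name, search_choice.name)
```

In this toy setup, setting `cost_weight` to 0 makes the stronger agent win on every skill, which loosely mirrors the "always call the strongest option" behavior the abstract calls routing collapse. An explicit cost term lets the choice vary with the skill demanded at each turn.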
