Arch-Router: Aligning LLM Routing with Human Preferences

Abstract

With the rapid proliferation of large language models (LLMs), each optimized for different strengths, styles, or latency/cost profiles, routing has become an essential technique for operationalizing the use of different models. However, existing LLM routing approaches are limited in two key ways: they evaluate performance using benchmarks that often fail to capture human preferences driven by subjective evaluation criteria, and they typically select from a limited pool of models. In this work, we propose a preference-aligned routing framework that guides model selection by matching queries to user-defined domains (e.g., travel) or action types (e.g., image editing), offering a practical mechanism to encode preferences in routing decisions. Specifically, we introduce Arch-Router, a compact 1.5B model that learns to map queries to domain-action preferences for model routing decisions. Our approach also supports seamlessly adding new models for routing without requiring retraining or architectural modifications. Experiments on conversational datasets demonstrate that our approach achieves state-of-the-art (SOTA) results in matching queries with human preferences, outperforming top proprietary models. Our approach captures subjective evaluation criteria and makes routing decisions more transparent and flexible. Our model is available at: https://huggingface.co/katanemo/Arch-Router-1.5B.
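The separation the abstract describes, between matching a query to a user-defined preference and choosing the model that serves it, can be illustrated with a minimal sketch. Everything named here is hypothetical: `RoutePolicy`, `select_policy`, `route`, and the model names are placeholders, and the keyword matcher is only a toy stand-in for the 1.5B router model, not the authors' method or API.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class RoutePolicy:
    """A user-defined preference (hypothetical type): a domain or action type."""
    name: str
    description: str
    keywords: tuple[str, ...]  # used only by the toy matcher below


def select_policy(query: str, policies: list[RoutePolicy], default: str = "other") -> str:
    """Toy stand-in for the router model: return the policy with the most keyword hits."""
    text = query.lower()
    best_name, best_hits = default, 0
    for policy in policies:
        hits = sum(1 for kw in policy.keywords if kw in text)
        if hits > best_hits:
            best_name, best_hits = policy.name, hits
    return best_name


def route(query: str, policies: list[RoutePolicy],
          policy_to_model: dict[str, str], fallback_model: str) -> str:
    """Map the query to a policy, then look up the model assigned to that policy."""
    policy_name = select_policy(query, policies)
    return policy_to_model.get(policy_name, fallback_model)


# Placeholder policies and model names, assumed for illustration only.
POLICIES = [
    RoutePolicy("travel", "trip planning, flights, hotels", ("flight", "hotel", "trip")),
    RoutePolicy("image_editing", "editing or retouching images", ("image", "photo", "crop")),
]
POLICY_TO_MODEL = {"travel": "model-a", "image_editing": "model-b"}

print(route("Find me a cheap flight and hotel in Lisbon",
            POLICIES, POLICY_TO_MODEL, "fallback-model"))
```

Because the policy-to-model table lives outside the matcher, pointing a policy at a different model is a one-line change to `POLICY_TO_MODEL`. This is the property the abstract refers to when it says new models can be added without retraining.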
