Arch-Router: Aligning LLM Routing with Human Preferences

Abstract

As large language models (LLMs) proliferate, each optimized for different strengths, styles, or latency/cost profiles, routing has become an essential technique for operationalizing the use of different models. However, existing LLM routing approaches are limited in two key ways: they evaluate performance using benchmarks that often fail to capture human preferences driven by subjective evaluation criteria, and they typically select from a limited pool of models. In this work, we propose a preference-aligned routing framework that guides model selection by matching queries to user-defined domains (e.g., travel) or action types (e.g., image editing), offering a practical mechanism to encode preferences in routing decisions. Specifically, we introduce Arch-Router, a compact 1.5B model that learns to map queries to domain-action preferences for model routing decisions. Our approach also supports seamlessly adding new models for routing without requiring retraining or architectural modifications. Experiments on conversational datasets demonstrate that our approach achieves state-of-the-art (SOTA) results in matching queries with human preferences, outperforming top proprietary models. Our approach captures subjective evaluation criteria and makes routing decisions more transparent and flexible. Our model is available at: https://huggingface.co/katanemo/Arch-Router-1.5B.
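
The separation the abstract describes, matching a query to a user-defined domain or action policy first and only then looking up which model serves that policy, can be sketched as follows. This is a minimal illustration, not the Arch-Router implementation: the `RoutePolicy` class, the keyword-based `match_policy` stand-in (which takes the place of the 1.5B router model), and the model names `model-a`, `model-b`, `model-c`, and `model-default` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class RoutePolicy:
    """A user-defined routing policy (hypothetical structure)."""
    name: str                  # domain or action label, e.g. "travel"
    description: str           # natural-language description of the policy
    keywords: Tuple[str, ...]  # used only by the stand-in matcher below


def match_policy(query: str, policies: List[RoutePolicy]) -> Optional[str]:
    """Stand-in for the router model: return the policy with the most
    keyword hits, or None if no policy matches at all."""
    words = set(query.lower().split())
    best_name, best_hits = None, 0
    for policy in policies:
        hits = len(words & set(policy.keywords))
        if hits > best_hits:
            best_name, best_hits = policy.name, hits
    return best_name


def route(query: str,
          policies: List[RoutePolicy],
          policy_to_model: Dict[str, str],
          default_model: str) -> str:
    """Two-step routing: query -> policy, then policy -> model."""
    name = match_policy(query, policies)
    return policy_to_model.get(name, default_model)


# Hypothetical configuration: policies and the models assigned to them.
POLICIES = [
    RoutePolicy("travel", "Trip planning and booking",
                ("flight", "hotel", "trip")),
    RoutePolicy("image_editing", "Requests to modify an image",
                ("crop", "resize", "image")),
]
POLICY_TO_MODEL = {"travel": "model-a", "image_editing": "model-b"}
DEFAULT_MODEL = "model-default"
```

Because the policy-to-model mapping is plain configuration in this sketch, pointing `"travel"` at a different model (for example `POLICY_TO_MODEL["travel"] = "model-c"`) changes where travel queries go without touching the matcher, which mirrors the abstract's claim that new models can be added without retraining the router.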
