IR3DE: A Linear Router for Large Language Models
June 4, 2026
Authors: Eros Fanì, Oğuzhan Ersoy
cs.AI
Abstract
Foundational Large Language Models (LLMs) demonstrate proficiency on a wide range of general tasks and, through domain-expert LLMs, achieve remarkable results on various specialized tasks. With the ever-growing list of available LLMs, inference routers have been proposed to select the most appropriate LLM for each prompt. However, existing routing methods either optimize cost across weak-to-strong generalist LLMs or require substantial training to support domain-expertise routing. In this paper, we propose IR3DE, a Ridge Regression-based Router for Domain Experts that provides cheap and fast routing decisions for each prompt. We evaluate IR3DE in two Causal Language Modeling (CLM) settings, where the task is next-token prediction for all domains, and in one reasoning setting, where each domain has its own distinct reasoning task. Despite being a linear router, IR3DE achieves performance comparable to the baselines in both CLM settings and surpasses them in the reasoning setting, with a normalized performance of 98.4%. Moreover, IR3DE allows domain experts to be added or removed without retraining the router from scratch, so a dynamic set of LLMs can be served with minimal disruption to the router itself. Our code is available at: github.com/gensyn-ai/IR3DE.
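To make the idea of a linear, ridge-regression router concrete, the following is a minimal sketch and not the authors' implementation. It assumes each prompt is already represented by an embedding vector and that a per-prompt quality score is available for every expert; the class name `LinearRouter`, its methods, the toy data, and the expert names are all hypothetical. Because the closed-form ridge solution shares one regularized Gram matrix across experts, adding an expert in this sketch only solves for one new weight vector, and removing one only discards its vector.

```python
import numpy as np


class LinearRouter:
    """Ridge-regression router sketch with one weight vector per expert."""

    def __init__(self, X, lam=1.0):
        # X: (n_prompts, dim) prompt embeddings (assumed to be given).
        self.X = X
        # Regularized Gram matrix X^T X + lam * I, shared by every expert.
        self.gram = X.T @ X + lam * np.eye(X.shape[1])
        self.weights = {}  # expert name -> (dim,) weight vector

    def add_expert(self, name, scores):
        # Closed-form ridge solution for this expert only:
        # w = (X^T X + lam * I)^{-1} X^T y. Other experts are untouched.
        self.weights[name] = np.linalg.solve(self.gram, self.X.T @ scores)

    def remove_expert(self, name):
        # Dropping an expert just discards its weight vector.
        del self.weights[name]

    def route(self, x):
        # Pick the expert with the highest predicted score for embedding x.
        return max(self.weights, key=lambda name: float(x @ self.weights[name]))


# Toy data (hypothetical): each expert's score depends on one coordinate.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
router = LinearRouter(X, lam=1.0)
router.add_expert("code", X[:, 0])
router.add_expert("math", X[:, 1])
print(router.route(np.array([2.0, 0.0, 0.0, 0.0])))

router.add_expert("law", X[:, 2])  # no refit of "code" or "math"
print(router.route(np.array([0.0, 0.0, 2.0, 0.0])))
```

In this sketch, routing a prompt costs one dot product per expert, which illustrates why a linear router can be cheap at inference time. How IR3DE actually builds its features and regression targets is described in the paper and the linked repository, not here.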