The Illusion of Specialization: Unveiling the Domain-Invariant "Standing Committee" in Mixture-of-Experts Models
January 6, 2026
Authors: Yan Wang, Yitao Xu, Nanhan Shen, Jinyan Su, Jimin Huang, Zining Zhu
cs.AI
Abstract
Mixture-of-Experts (MoE) models are widely assumed to achieve domain specialization through sparse routing. In this work, we question this assumption by introducing COMMITTEEAUDIT, a post hoc framework that analyzes routing behavior at the level of expert groups rather than individual experts. Across three representative models and the MMLU benchmark, we uncover a domain-invariant Standing Committee: a compact coalition of routed experts that consistently captures the majority of routing mass across domains, layers, and routing budgets, even when the architecture already includes shared experts. Qualitative analysis further shows that Standing Committees anchor reasoning structure and syntax, while peripheral experts handle domain-specific knowledge. These findings reveal a strong structural bias toward centralized computation, suggesting that specialization in Mixture-of-Experts models is far less pervasive than commonly believed. This inherent bias also indicates that current training objectives, such as load-balancing losses that enforce uniform expert utilization, may work against the model's natural optimization path, thereby limiting training efficiency and performance.