DR-LoRA: Dynamic Rank LoRA for Mixture-of-Experts Adaptation
January 8, 2026
Authors: Guanzhi Deng, Bo Li, Ronghao Chen, Huacan Wang, Linqi Song, Lijie Wen
cs.AI
Abstract
Mixture-of-Experts (MoE) has become a prominent paradigm for scaling Large Language Models (LLMs). Parameter-efficient fine-tuning (PEFT) methods such as LoRA are widely adopted to adapt pretrained MoE LLMs to downstream tasks. However, existing approaches assign identical LoRA ranks to all experts, overlooking the intrinsic functional specialization within MoE LLMs. This uniform allocation leads to a resource mismatch: task-relevant experts are under-provisioned while less relevant ones receive redundant parameters. We propose a Dynamic Rank LoRA framework, DR-LoRA, which dynamically grows expert LoRA ranks during fine-tuning based on task-specific demands. DR-LoRA employs an Expert Saliency Scoring mechanism that integrates expert routing frequency and LoRA rank importance to quantify each expert's demand for additional capacity. Experts with higher saliency scores are prioritized for rank expansion, enabling the automatic formation of a heterogeneous rank distribution tailored to the target task. Experiments on multiple benchmarks demonstrate that DR-LoRA consistently outperforms standard LoRA and static allocation strategies under the same parameter budget, achieving superior task performance with more efficient parameter utilization.
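The abstract does not give the exact saliency formula, so the sketch below only illustrates the described idea: a per-expert saliency score that mixes routing frequency with a LoRA importance signal, followed by greedy rank growth for the highest-scoring experts under a fixed budget. The function names (expert_saliency, grow_ranks), the |weight * gradient| importance proxy, the normalization, and the greedy allocation rule are all assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch of expert saliency scoring for dynamic LoRA rank growth.
# Assumptions: saliency = routing_freq * normalized first-order importance,
# and rank is grown greedily for the top-scoring experts.
import torch


def expert_saliency(routing_freq: torch.Tensor,
                    lora_A: list[torch.Tensor],
                    lora_B: list[torch.Tensor]) -> torch.Tensor:
    """Combine routing frequency with a gradient-based LoRA importance proxy.

    routing_freq: (num_experts,) fraction of tokens routed to each expert.
    lora_A / lora_B: per-expert LoRA factors with .grad populated after backward().
    """
    scores = []
    for A, B in zip(lora_A, lora_B):
        # |w * grad| is a common first-order importance proxy (an assumption here).
        imp = (A * A.grad).abs().sum() + (B * B.grad).abs().sum()
        scores.append(imp)
    importance = torch.stack(scores)
    # Normalize the importance signal so neither factor dominates by scale.
    importance = importance / (importance.sum() + 1e-8)
    return routing_freq * importance  # higher => more demand for extra rank


def grow_ranks(ranks: list[int], saliency: torch.Tensor,
               budget: int, step: int = 2) -> list[int]:
    """Give `step` extra rank to the highest-saliency experts until the budget is spent."""
    ranks = list(ranks)
    for idx in torch.argsort(saliency, descending=True).tolist():
        if budget < step:
            break
        ranks[idx] += step
        budget -= step
    return ranks
```

Under these assumptions, invoking grow_ranks periodically during fine-tuning would gradually shift capacity toward frequently routed, high-importance experts while keeping the total rank budget fixed; the actual DR-LoRA schedule and expansion rule may differ.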