DivMerge: A divergence-based model merging method for multi-tasking
September 2, 2025
Authors: Touayouch Brahim, Fosse Loïc, Damnati Géraldine, Lecorvé Gwénolé
cs.AI
Abstract
Multi-task learning (MTL) is often achieved by merging datasets before
fine-tuning, but the growing availability of fine-tuned models has led to new
approaches such as model merging via task arithmetic. A major challenge in this
setting is task interference, which worsens as the number of tasks increases.
We propose a method that merges models trained on different tasks into a single
model, maintaining strong performance across all tasks. Our approach leverages
Jensen-Shannon divergence to guide the merging process without requiring
additional labelled data, and automatically balances task importance. Unlike
existing methods, our approach remains robust as the number of tasks grows and
consistently outperforms prior work.
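To make the abstract's two ingredients concrete, task-arithmetic merging and a Jensen-Shannon objective evaluated on unlabelled data, here is a minimal, self-contained sketch. The toy model, the single coefficient per task vector, and the gradient-descent search over those coefficients are our assumptions for illustration; the paper's actual DivMerge procedure may differ in its parameterisation and optimisation details.

```python
# Illustrative sketch (not the authors' released code): task-arithmetic
# merging with coefficients tuned to minimise the total Jensen-Shannon
# divergence between the merged model and each expert on unlabelled inputs.
import torch
import torch.nn.functional as F
from torch.func import functional_call

def js_divergence(p_logits, q_logits):
    """Jensen-Shannon divergence between categorical distributions (as logits)."""
    p, q = F.softmax(p_logits, dim=-1), F.softmax(q_logits, dim=-1)
    m = 0.5 * (p + q)
    # F.kl_div(m.log(), p) computes KL(p || m); JSD symmetrises and averages.
    return 0.5 * (F.kl_div(m.log(), p, reduction="batchmean")
                  + F.kl_div(m.log(), q, reduction="batchmean"))

def merged_params(base, taus, lambdas):
    """Task arithmetic: theta = theta_base + sum_i lambda_i * tau_i."""
    return {k: base[k] + sum(lam * tau[k] for lam, tau in zip(lambdas, taus))
            for k in base}

# Toy setup: one architecture, a base checkpoint, and two task experts.
model = torch.nn.Linear(16, 4)
base = {k: v.detach().clone() for k, v in model.state_dict().items()}
experts = [{k: v + 0.1 * torch.randn_like(v) for k, v in base.items()}
           for _ in range(2)]
taus = [{k: e[k] - base[k] for k in base} for e in experts]  # task vectors

# Learn one coefficient per task by gradient descent on the JS objective,
# using only unlabelled inputs (no task labels required).
lambdas = torch.nn.Parameter(torch.full((len(taus),), 0.5))
opt = torch.optim.Adam([lambdas], lr=0.05)
unlabelled = torch.randn(32, 16)

for step in range(100):
    theta = merged_params(base, taus, lambdas)
    merged_logits = functional_call(model, theta, (unlabelled,))
    loss = sum(js_divergence(merged_logits,
                             functional_call(model, e, (unlabelled,)))
               for e in experts)
    opt.zero_grad()
    loss.backward()
    opt.step()

print("learned merging coefficients:", lambdas.detach())
```

Because the objective compares output distributions only, it needs no labelled data, and the learned coefficients act as the automatic task-importance balancing the abstract describes: a task vector that pulls the merged model away from the other experts receives a smaller weight.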