DivMerge: A divergence-based model merging method for multi-tasking
September 2, 2025
Authors: Touayouch Brahim, Fosse Loïc, Damnati Géraldine, Lecorvé Gwénolé
cs.AI
Abstract
Multi-task learning (MTL) is often achieved by merging datasets before
fine-tuning, but the growing availability of fine-tuned models has led to new
approaches such as model merging via task arithmetic. A major challenge in this
setting is task interference, which worsens as the number of tasks increases.
We propose a method that merges models trained on different tasks into a single
model, maintaining strong performance across all tasks. Our approach leverages
Jensen-Shannon divergence to guide the merging process without requiring
additional labelled data, and automatically balances task importance. Unlike
existing methods, our approach remains robust as the number of tasks grows and
consistently outperforms prior work.
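To make the two ingredients mentioned in the abstract concrete, here is a minimal, illustrative sketch of (1) the Jensen-Shannon divergence between two discrete output distributions and (2) a task-arithmetic merge of fine-tuned weights. The function names, the weighting interface, and the flat-vector representation of model parameters are assumptions for illustration; this is not the paper's actual algorithm.

```python
import numpy as np

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two discrete distributions.

    JSD(P || Q) = 0.5 * KL(P || M) + 0.5 * KL(Q || M), with M = (P + Q) / 2.
    Symmetric, and bounded above by ln(2) for fully disjoint supports.
    """
    p = np.asarray(p, dtype=float) + eps  # smoothing to avoid log(0)
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log(a / b))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def merge_task_vectors(base, task_params, weights):
    """Task arithmetic: merged = base + sum_i w_i * (theta_i - base).

    `base` and each entry of `task_params` are flat parameter vectors;
    `weights` is where a divergence-based criterion would plug in
    (illustrative placeholder, not the method from the paper).
    """
    merged = base.copy()
    for theta, w in zip(task_params, weights):
        merged += w * (theta - base)
    return merged
```

In a divergence-guided setting, the merge weights could be tuned so that the merged model's output distribution stays close (in JS divergence) to each task-specific model's outputs on unlabelled inputs, which is consistent with the abstract's claim of needing no additional labelled data.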