Model Merging with Functional Dual Anchors

Abstract

Model merging is an efficient post-training strategy for integrating knowledge from multiple fine-tuned checkpoints of a shared foundation model. Existing methods operate in the parameter space, combining task vectors to mitigate conflicts, but they remain constrained by parameter inconsistencies. We propose Functional Dual Anchors (FDAs), a framework that instead models the input-representation space. FDAs are synthetic inputs whose induced gradients align with task vectors, capturing task-specific functional shifts relative to the pretrained model. This perspective bridges joint multi-task training and post-hoc merging, offering both robustness and flexibility. We further introduce a principled initialization scheme and show that FDAs are complementary to parameter-space model merging. Comprehensive experiments demonstrate the effectiveness of FDAs in model merging.
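To make the phrase "synthetic inputs whose induced gradients align with task vectors" concrete, the following is a minimal toy sketch and not the authors' algorithm. It assumes a linear model f(x; W) = Wx, an output-matching loss 0.5 * sum_j ||W x_j - W1 x_j||^2 on the anchors, and a simple accept-if-better gradient ascent on the anchors; all of these choices, and every function name below, are hypothetical and introduced only for illustration.

```python
import numpy as np

def induced_gradient(W0, W1, X):
    """Gradient at W0 of 0.5 * sum_j ||W x_j - W1 x_j||^2 over anchor rows x_j of X."""
    residual = X @ (W0 - W1).T            # (n, m): pretrained minus fine-tuned outputs
    return residual.T @ X                 # (m, d), same shape as the weights

def alignment(W0, W1, X):
    """Cosine between the descent direction (-gradient) and the task vector W1 - W0."""
    tau = W1 - W0
    G = -induced_gradient(W0, W1, X)
    return float(np.sum(tau * G) / (np.linalg.norm(tau) * np.linalg.norm(G)))

def alignment_grad(W0, W1, X):
    """Analytic gradient of alignment() w.r.t. the anchors (valid for this linear toy only)."""
    tau = W1 - W0
    G = tau @ (X.T @ X)                   # equals -induced_gradient(W0, W1, X)
    a, b = np.linalg.norm(tau), np.linalg.norm(G)
    J = np.sum(tau * G) / (a * b)
    dG = (tau / a - J * G / b) / b        # d(cosine)/dG
    M = tau.T @ dG                        # d(cosine)/d(X^T X)
    return X @ (M + M.T)

def fit_anchors(W0, W1, X, steps=300, lr=1.0):
    """Gradient ascent on the anchors; a step is kept only if alignment improves."""
    X = X.copy()
    best = alignment(W0, W1, X)
    for _ in range(steps):
        candidate = X + lr * alignment_grad(W0, W1, X)
        score = alignment(W0, W1, candidate)
        if score > best:
            X, best = candidate, score
        else:
            lr *= 0.5                     # shrink the step after a rejected update
    return X

rng = np.random.default_rng(0)
d, m, n = 6, 3, 4                         # input dim, output dim, number of anchors
W0 = rng.normal(size=(m, d))              # stand-in for pretrained weights
W1 = W0 + 0.1 * rng.normal(size=(m, d))   # stand-in for a fine-tuned checkpoint
X_init = rng.normal(size=(n, d))          # random initial anchors
X_fit = fit_anchors(W0, W1, X_init)
before = alignment(W0, W1, X_init)
after = alignment(W0, W1, X_fit)
print(f"alignment before: {before:.3f}, after: {after:.3f}")
```

In this toy setting the descent direction equals the task vector multiplied by X^T X, so orthonormal anchors (X^T X = I) give perfect alignment. Real networks have no such closed form, which is why the anchors would have to be found by optimization.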
