Model Merging with Functional Dual Anchors

October 24, 2025
Authors: Kexuan Shi, Yandong Wen, Weiyang Liu
cs.AI

Abstract

Model merging is an efficient post-training strategy for integrating knowledge from multiple finetuned checkpoints of a shared foundation model. Existing methods operate in the parameter space, combining task vectors to mitigate conflicts, but remain constrained by parameter inconsistencies. We propose Functional Dual Anchors (FDAs), a framework that instead models the input-representation space. FDAs are synthetic inputs whose induced gradients align with task vectors, capturing task-specific functional shifts relative to the pretrained model. This perspective bridges joint multi-task training and post-hoc merging, offering both robustness and flexibility. We further introduce a principled initialization scheme and show that FDAs are complementary to parameter-space model merging. Comprehensive experiments demonstrate the effectiveness of FDAs in model merging.
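
To make the two viewpoints in the abstract concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation. The first half shows the parameter-space baseline the abstract refers to: a task vector is the finetuned weights minus the pretrained weights, and a simple merge adds a scaled sum of task vectors to the base. The second half shows one possible reading of "synthetic inputs whose induced gradients align with task vectors", using a toy linear model and a squared-error loss that are assumptions chosen for this example. All function names, weights, and the `scale` value are hypothetical.

```python
import math

def task_vector(finetuned, pretrained):
    """Parameter-space shift of a finetuned checkpoint relative to the base."""
    return [f - p for f, p in zip(finetuned, pretrained)]

def merge_task_arithmetic(pretrained, task_vectors, scale=1.0):
    """Baseline parameter-space merge: base + scale * (sum of task vectors)."""
    summed = [sum(column) for column in zip(*task_vectors)]
    return [p + scale * s for p, s in zip(pretrained, summed)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def negative_gradient_at_base(x, pretrained, finetuned):
    """Assumed toy setup: f(w, x) = w . x and
    L(w) = 0.5 * (f(w, x) - f(finetuned, x)) ** 2.
    Returns -dL/dw evaluated at w = pretrained."""
    residual = dot(pretrained, x) - dot(finetuned, x)
    return [-residual * xi for xi in x]

# Hypothetical 3-parameter "models".
pretrained = [0.0, 1.0, -1.0]
finetuned_a = [0.5, 1.0, -1.5]
finetuned_b = [0.0, 2.0, -1.0]

tau_a = task_vector(finetuned_a, pretrained)
tau_b = task_vector(finetuned_b, pretrained)
merged = merge_task_arithmetic(pretrained, [tau_a, tau_b], scale=0.5)

# In this linear toy, an input proportional to the task vector induces a
# descent direction parallel to that task vector; an arbitrary input does not.
anchor_like_input = list(tau_a)
arbitrary_input = [1.0, 1.0, 0.0]
aligned = cosine(
    negative_gradient_at_base(anchor_like_input, pretrained, finetuned_a), tau_a
)
unaligned = cosine(
    negative_gradient_at_base(arbitrary_input, pretrained, finetuned_a), tau_a
)

print("merged weights:", merged)
print("alignment with anchor-like input:", round(aligned, 6))
print("alignment with arbitrary input:", round(unaligned, 6))
```

In this toy case the anchor-like input is picked by hand. The abstract describes FDAs as synthetic inputs, so in a real network they would presumably be obtained by optimization rather than in closed form, and the abstract does not specify that procedure.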