Crowded in B-Space: Calibrating Shared Directions for LoRA Merging
April 18, 2026
Authors: Yixuan Tang, Yi Yang
cs.AI
Abstract
Merging separately trained LoRA adapters is a practical alternative to joint multi-task training, but it often hurts performance. Existing methods usually treat the LoRA update ΔW = BA as a single object and do not distinguish the roles of the two LoRA matrices. We show that the main source of merge interference is the output-side matrix B: across tasks, B repeatedly reuses a small set of shared directions, while A remains much more task-specific. As a result, the merged adapter overemphasizes these shared directions, and task-specific information is lost. We propose Pico (Pre-merge interference calibration in output space), a data-free method that calibrates B before merging by downscaling over-shared directions and then rescaling the merged update. Pico plugs directly into existing merging methods such as Task Arithmetic, TIES, and TSV-M. Across eight benchmarks spanning math, coding, finance, and medical domains, Pico improves average accuracy by 3.4-8.3 points over the corresponding base method and achieves the best overall average performance. Pico even enables merged adapters to outperform a LoRA trained on all task data combined. These results show that LoRA merging works better when the two LoRA matrices are treated separately.
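The abstract does not specify Pico's exact calibration procedure, but the described mechanism (identify output directions that many B matrices share, downscale them before merging, then rescale the merged update) can be sketched as follows. This is a minimal, hypothetical NumPy instantiation: the sharedness score (entropy of per-task loadings), the threshold, and the damping factor `alpha` are all illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def calibrate_and_merge(loras, alpha=0.5):
    """Illustrative sketch (not the paper's exact method):
    damp output directions shared across many B matrices, merge the
    LoRA updates by summation (Task Arithmetic style), and rescale
    the result to the mean norm of the raw per-task updates.

    loras: list of (B, A) pairs, B: (d_out, r), A: (r, d_in).
    alpha: assumed damping factor for over-shared directions.
    """
    Bs = [B for B, _ in loras]
    # Candidate shared output directions: left singular vectors
    # of all B matrices stacked along the rank dimension.
    U, _, _ = np.linalg.svd(np.hstack(Bs), full_matrices=False)
    # Per-direction loading of each task's B onto those directions.
    loads = np.stack([np.linalg.norm(U.T @ B, axis=1) for B in Bs])
    loads /= loads.sum(axis=0, keepdims=True) + 1e-12
    # High entropy across tasks => direction is shared by many tasks.
    entropy = -(loads * np.log(loads + 1e-12)).sum(axis=0)
    shared = entropy > 0.9 * np.log(len(Bs))  # heuristic threshold
    # Downscale shared directions in each B before merging.
    scale = np.where(shared, alpha, 1.0)
    P = U @ np.diag(scale) @ U.T
    merged = sum((P @ B) @ A for B, A in loras)
    # Rescale merged update toward the typical single-task norm.
    raw_norm = np.mean([np.linalg.norm(B @ A) for B, A in loras])
    return merged * raw_norm / (np.linalg.norm(merged) + 1e-12)
```

The key design point mirrored from the abstract is that only B is modified before merging, while each A is left untouched, since A is reported to carry the task-specific information.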