Understanding and Enforcing Weight Disentanglement in Task Arithmetic

Abstract

Task arithmetic provides an efficient, training-free way to edit pre-trained models, yet its success lacks a fundamental theoretical explanation. The existing concept of "weight disentanglement" describes the ideal outcome of non-interfering task composition but does not reveal its underlying cause. Crucially, it remains underexplored which intrinsic properties of the pre-trained model (θ_0) or the task vectors (τ_t) enable this disentanglement. In this paper, we introduce Task-Feature Specialization (TFS), a model's ability to allocate distinct internal features to different tasks, as the fundamental principle. We first prove that TFS is a sufficient condition for weight disentanglement. More importantly, we find that TFS also gives rise to an observable geometric consequence: weight vector orthogonality. This positions TFS as the common cause of both the desired functional outcome (disentanglement) and a measurable geometric property (orthogonality). This relationship provides the key insight for our method: since the abstract TFS property is intractable to enforce directly, we can instead promote weight disentanglement by shaping its concrete geometric consequence, orthogonality. We therefore propose OrthoReg, a simple and effective regularization method that actively enforces an internal orthogonal structure on the weight updates (ΔW) that constitute τ_t during fine-tuning. We also prove theoretically that OrthoReg promotes disentanglement. Extensive experiments demonstrate that OrthoReg consistently and significantly enhances the performance of various task arithmetic methods. Code is available at https://github.com/RL-MIND/OrthoReg.
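To make the two ingredients of the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It shows task arithmetic as θ = θ_0 + α Σ_t τ_t with τ_t = θ_t − θ_0, and one possible orthogonality penalty on a weight update ΔW. The abstract does not state the exact form of the OrthoReg regularizer, so the off-diagonal Gram penalty used here is an assumption. The function names (task_vector, apply_task_vectors, orthogonality_penalty) and the scaling coefficient alpha are hypothetical, chosen only for illustration.

```python
import numpy as np


def task_vector(theta_ft, theta_0):
    """Task vector tau_t = theta_t - theta_0, computed per parameter name."""
    return {name: theta_ft[name] - theta_0[name] for name in theta_0}


def apply_task_vectors(theta_0, task_vectors, alpha=1.0):
    """Task arithmetic: theta = theta_0 + alpha * sum_t tau_t (no training)."""
    merged = {name: value.copy() for name, value in theta_0.items()}
    for tau in task_vectors:
        for name in merged:
            merged[name] = merged[name] + alpha * tau[name]
    return merged


def orthogonality_penalty(delta_w):
    """Assumed penalty (not taken from the paper): sum of squared
    off-diagonal entries of the Gram matrix delta_w^T delta_w.
    It is zero exactly when the columns of delta_w are mutually orthogonal."""
    gram = delta_w.T @ delta_w
    off_diag = gram - np.diag(np.diag(gram))
    return float(np.sum(off_diag ** 2))


# Toy usage: two "fine-tuned" models that differ from theta_0 in one entry each.
theta_0 = {"w": np.zeros(2)}
tau_a = task_vector({"w": np.array([1.0, 0.0])}, theta_0)
tau_b = task_vector({"w": np.array([0.0, 2.0])}, theta_0)
merged = apply_task_vectors(theta_0, [tau_a, tau_b], alpha=1.0)
```

In a fine-tuning loop, a term like this would be added to the task loss with a small weight so that the update ΔW = W − W_0 is pushed toward orthogonal columns. The weighting and the layers it applies to are design choices that the abstract does not specify.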

