Decoupling Angles and Strength in Low-rank Adaptation

Abstract

Parameter-Efficient FineTuning (PEFT) methods have recently gained significant popularity thanks to the widespread availability of large-scale pretrained models. These methods allow quick adaptation to downstream tasks at minimal computational cost. However, popular finetuning methods such as LoRA show limited robustness to hyperparameter choices and extended training regimes, which prevents optimal out-of-the-box performance. In contrast, bounded approaches such as ETHER provide greater robustness but are limited to extremely low-rank adaptations and fixed-strength transformations, which reduces their expressive power. In this work, we propose Decoupled Low-rank Adaptation (DeLoRA), a novel finetuning method that normalizes and scales learnable low-rank matrices. By bounding the distance of the transformation, DeLoRA effectively decouples angular learning from adaptation strength, enhancing robustness without compromising performance. Through evaluations on subject-driven image generation, natural language understanding, and instruction tuning, we show that DeLoRA matches or surpasses the performance of competing PEFT methods while exhibiting stronger robustness. Code is available at https://github.com/ExplainableML/DeLoRA.
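To make the idea of "normalizing and scaling learnable low-rank matrices" concrete, the following is a minimal NumPy sketch of one way a bounded low-rank update could be formed. It is an illustration under stated assumptions, not the authors' implementation: the function name `bounded_low_rank_update`, the per-component Frobenius normalization, and the `strength / r` scaling are choices made here for the example, and the exact formulation in the paper and repository may differ.

```python
import numpy as np


def bounded_low_rank_update(B, A, strength):
    """Hypothetical sketch: low-rank update with Frobenius norm <= strength.

    B has shape (d_out, r) and A has shape (r, d_in). Each rank-one term
    b_i a_i^T is rescaled to unit Frobenius norm, so the factors only set
    its direction; `strength` alone bounds the size of the update.
    """
    r = B.shape[1]
    # For a rank-one matrix, ||b_i a_i^T||_F = ||b_i|| * ||a_i||.
    norms = np.linalg.norm(B, axis=0) * np.linalg.norm(A, axis=1)
    xi = 1.0 / np.maximum(norms, 1e-12)  # guard against zero-norm components
    # Average of r unit-norm terms, scaled by strength: norm <= strength
    # by the triangle inequality.
    return (strength / r) * (B * xi) @ A


rng = np.random.default_rng(0)
B = rng.normal(size=(6, 4))
A = rng.normal(size=(4, 5))
delta_w = bounded_low_rank_update(B, A, strength=2.0)
print(np.linalg.norm(delta_w) <= 2.0)
```

In this sketch, rescaling `B` or `A` by a positive constant leaves the update unchanged, which is one way to read the decoupling of direction (angle) from strength described in the abstract.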
