SingLoRA: Low Rank Adaptation Using a Single Matrix

Abstract

Low-Rank Adaptation (LoRA) has significantly advanced parameter-efficient fine-tuning of large pretrained models. LoRA augments a model's pretrained weights by adding the product of two smaller matrices, which together form a low-rank update. Recent research has shown that scale disparities between these two matrices often cause unstable training dynamics, leading to suboptimal performance. In this paper, we propose SingLoRA, which reformulates low-rank adaptation by learning the weight update as the product of a single low-rank matrix and its transpose. This simple design inherently removes inter-matrix scale conflicts, ensuring stable optimization, and roughly halves the parameter count. We analyze SingLoRA within the infinite-width neural network framework and show that it guarantees stable feature learning by construction. Extensive experiments on multiple tasks validate these benefits. In common sense reasoning, fine-tuning LLaMA 7B on MNLI with SingLoRA achieves 91.3% accuracy, surpassing LoRA (89.1%) and LoRA+ (90.2%) while using only 60% of their parameter budget. In image generation, fine-tuning Stable Diffusion with SingLoRA significantly improves image fidelity on DreamBooth, achieving a DINO similarity score of 0.151, compared with 0.148 for DoRA and 0.143 for LoRA.
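
To make the contrast between the two update forms concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a square d x d weight matrix, and the names (`lora_update`, `singlora_update`, `W0`, `A_single`), the dimensions, and the initialization scale are all hypothetical choices made for illustration. It only shows that a two-matrix update B A needs 2dr adapter parameters, while a single-matrix update A Aᵀ needs dr and is symmetric with rank at most r.

```python
import numpy as np


def lora_update(B, A):
    """Two-matrix low-rank update: (d x r) @ (r x d)."""
    return B @ A


def singlora_update(A):
    """Single-matrix low-rank update: A @ A.T, symmetric, rank at most r."""
    return A @ A.T


rng = np.random.default_rng(0)
d, r = 64, 4  # hypothetical layer width and adapter rank (assumption)

W0 = rng.standard_normal((d, d))                 # stand-in for frozen pretrained weights
B = rng.standard_normal((d, r)) * 0.01           # LoRA factor 1 (arbitrary init scale)
A = rng.standard_normal((r, d)) * 0.01           # LoRA factor 2 (arbitrary init scale)
A_single = rng.standard_normal((d, r)) * 0.01    # the only trainable factor in the single-matrix form

W_lora = W0 + lora_update(B, A)
W_single = W0 + singlora_update(A_single)

lora_params = B.size + A.size    # 2 * d * r
single_params = A_single.size    # d * r

delta = singlora_update(A_single)
delta_rank = np.linalg.matrix_rank(delta)

print("LoRA adapter parameters:         ", lora_params)
print("Single-matrix adapter parameters:", single_params)
print("Single-matrix update is symmetric:", np.allclose(delta, delta.T))
```

Because only one factor exists in the second form, there is no pair of matrices whose scales can drift apart, which is the property the abstract attributes to the design. The sketch covers only the square case described above; non-square layers are outside what it illustrates.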
