SingLoRA: Low Rank Adaptation Using a Single Matrix
July 8, 2025
Authors: David Bensaïd, Noam Rotstein, Roy Velich, Daniel Bensaïd, Ron Kimmel
cs.AI
Abstract
Low-Rank Adaptation (LoRA) has significantly advanced parameter-efficient
fine-tuning of large pretrained models. LoRA augments the pre-trained weights
of a model by adding the product of two smaller matrices that together form a
low-rank matrix update. Recent research has shown that scale disparities
between these two matrices often cause unstable training dynamics, leading to
suboptimal performance. In this paper, we propose SingLoRA, which reformulates
low-rank adaptation by learning the weight update as the product of a single
low-rank matrix and its transpose. This simple design inherently removes
inter-matrix scale conflicts, ensures stable optimization, and roughly halves
the parameter count. We analyze SingLoRA within the
infinite-width neural network framework, showing that it guarantees stable
feature learning by construction. Extensive experiments on multiple tasks
validate these benefits. In common sense reasoning, fine-tuning LLaMA 7B on
MNLI with SingLoRA achieves 91.3% accuracy, surpassing LoRA (89.1%) and LoRA+
(90.2%), while using only 60% of their parameter budget. In image generation,
fine-tuning Stable Diffusion with SingLoRA significantly improves image
fidelity on DreamBooth, achieving a DINO similarity score of 0.151, compared to
scores of 0.148 and 0.143 for DoRA and LoRA, respectively.
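To make the reformulation concrete, below is a minimal PyTorch sketch of a SingLoRA-style adapter for a square weight matrix, based only on the description in this abstract: the pretrained weight W0 is frozen, and the update is A Aᵀ built from a single trainable low-rank matrix A. The class name `SingLoRALinear`, the initialization, and the `scale` factor are illustrative assumptions, not the paper's exact recipe.

```python
import torch
import torch.nn as nn

class SingLoRALinear(nn.Module):
    """Sketch of a SingLoRA-style adapter for a square weight matrix.

    The frozen pretrained weight W0 (d x d) is augmented with a symmetric
    low-rank update A @ A.T built from a single trainable matrix A (d x r).
    Initialization and scaling here are assumptions, not the paper's recipe.
    """

    def __init__(self, base_linear: nn.Linear, rank: int = 8, scale: float = 1.0):
        super().__init__()
        d_out, d_in = base_linear.weight.shape
        assert d_out == d_in, "this sketch only covers square weight matrices"
        self.base = base_linear
        self.base.weight.requires_grad_(False)  # freeze the pretrained weight W0
        # Single trainable low-rank matrix A (d x r); small random init (assumed)
        self.A = nn.Parameter(torch.randn(d_out, rank) * 0.01)
        self.scale = scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Effective weight: W0 + scale * A A^T  (A A^T has rank at most r)
        delta = self.A @ self.A.t()
        return self.base(x) + self.scale * (x @ delta.t())

# Illustrative usage: wrap one projection layer and run a forward pass
layer = nn.Linear(768, 768)
adapted = SingLoRALinear(layer, rank=8)
y = adapted(torch.randn(2, 16, 768))
print(y.shape)  # torch.Size([2, 16, 768])
```

Because the update uses a single matrix, there is no scale mismatch between two separately learned factors (as in LoRA's B A), and the adapter stores roughly half as many trainable parameters for the same rank.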