SingLoRA: Low Rank Adaptation Using a Single Matrix
July 8, 2025
Authors: David Bensaïd, Noam Rotstein, Roy Velich, Daniel Bensaïd, Ron Kimmel
cs.AI
Abstract
Low-Rank Adaptation (LoRA) has significantly advanced parameter-efficient
fine-tuning of large pretrained models. LoRA augments the pre-trained weights
of a model by adding the product of two smaller matrices that together form a
low-rank matrix update. Recent research has shown that scale disparities
between these two matrices often cause unstable training dynamics, leading to
suboptimal performance. In this paper, we propose SingLoRA, which reformulates
low-rank adaptation by learning the weight update as a decomposition of a
single low-rank matrix multiplied by its transpose. This simple design
inherently removes inter-matrix scale conflicts, ensuring stable optimization,
and roughly halves the parameter count. We analyze SingLoRA within the
infinite-width neural network framework, showing that it guarantees stable
feature learning by construction. Extensive experiments on multiple tasks
validate these benefits. In common sense reasoning, fine-tuning LLaMA 7B on
MNLI with SingLoRA achieves 91.3% accuracy, surpassing LoRA (89.1%) and LoRA+
(90.2%), while using only 60% of their parameter budget. In image generation,
fine-tuning Stable Diffusion with SingLoRA significantly improves image
fidelity on DreamBooth, achieving a DINO similarity score of 0.151, compared to
scores of 0.148 and 0.143 for DoRA and LoRA, respectively.
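
The update described above, augmenting frozen pretrained weights with the product of a single trainable low-rank matrix and its transpose, can be sketched in a few lines. The snippet below is a minimal illustration only, assuming a square weight matrix and a simple alpha/rank scaling; the module name `SingLoRALinear`, its initialization, and its hyperparameters are illustrative assumptions, not the authors' reference implementation.

```python
import torch
import torch.nn as nn


class SingLoRALinear(nn.Module):
    """Minimal sketch of a SingLoRA-style adapter for a square linear layer.

    The frozen pretrained weight W0 (d x d) is augmented by the symmetric
    low-rank update A @ A.T, where A (d x r) is the single trainable matrix.
    Scaling and init choices here are illustrative assumptions.
    """

    def __init__(self, base_linear: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        d_out, d_in = base_linear.weight.shape
        assert d_out == d_in, "this sketch assumes a square weight; the paper handles the general case"
        self.base = base_linear
        for p in self.base.parameters():
            p.requires_grad_(False)  # keep the pretrained weights frozen
        # Single low-rank factor; its transpose is reused instead of a second matrix B.
        self.A = nn.Parameter(torch.randn(d_out, rank) * 0.01)
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # y = base(x) + scale * x (A A^T); A A^T is symmetric by construction,
        # so there is no second matrix whose scale could drift apart from A's.
        update = self.scale * (x @ self.A) @ self.A.t()
        return self.base(x) + update
```

In use, one would wrap selected square projection layers (for example attention projections) with such a module and train only `A`, which is where the roughly halved parameter count relative to a two-matrix LoRA update comes from.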