MatryoshkaLoRA: Learning Accurate Hierarchical Low-Rank Representations for LLM Fine-Tuning
May 8, 2026
Authors: Ionut-Vlad Modoranu, Mher Safaryan, Dan Alistarh
cs.AI
Abstract
As deep learning models scale to billions of parameters, the computational cost of fine-tuning remains a significant barrier to deployment. While Low-Rank Adaptation (LoRA) has become the standard for parameter-efficient fine-tuning, the need to set a predefined, static rank r requires exhaustive grid searches to balance efficiency and performance. Existing rank-adaptive solutions such as DyLoRA mitigate this by sampling ranks during training from a predefined distribution. However, they often yield sub-optimal results at higher ranks due to a lack of consistent gradient signals across the full hierarchy of ranks, making these methods data-inefficient. In this paper, we propose MatryoshkaLoRA, a general, Matryoshka-inspired training framework for LoRA that learns accurate hierarchical low-rank representations by inserting a fixed, carefully crafted diagonal matrix P between the existing LoRA adapters to scale their sub-ranks accordingly. With this simple modification, our general framework recovers LoRA and DyLoRA merely by changing P, and it ensures that all sub-ranks embed the available gradient information efficiently. MatryoshkaLoRA supports dynamic rank selection with minimal degradation in accuracy. We further propose the Area Under the Rank Accuracy Curve (AURAC), a metric that consistently evaluates the performance of hierarchical low-rank adapters. Our results demonstrate that MatryoshkaLoRA learns more accurate hierarchical low-rank representations than prior rank-adaptive approaches and achieves superior accuracy-performance trade-offs across ranks on the evaluated datasets. Our code is available at https://github.com/IST-DASLab/MatryoshkaLoRA.
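The core mechanism described above, a fixed diagonal matrix P inserted between the LoRA factors so that any prefix of the rank dimension forms a nested sub-adapter, can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the actual entries of P are a design choice of the paper, and here we only demonstrate the nesting property and the stated fact that a particular choice of P (the identity) recovers plain LoRA.

```python
import numpy as np

def matryoshka_lora_delta(A, B, p, k=None):
    """Low-rank update Delta W = B @ diag(p) @ A, optionally truncated to sub-rank k.

    A: (r, d_in) down-projection, B: (d_out, r) up-projection,
    p: (r,) fixed diagonal scaling (the matrix P of the paper).
    Keeping only the first k rank components yields the nested rank-k sub-adapter.
    """
    if k is None:
        k = A.shape[0]
    return B[:, :k] @ np.diag(p[:k]) @ A[:k, :]

# Hypothetical shapes for illustration only.
rng = np.random.default_rng(0)
d_in, d_out, r = 8, 8, 4
A = rng.standard_normal((r, d_in))
B = rng.standard_normal((d_out, r))

# With P = I, the full-rank update reduces to plain LoRA's B @ A,
# consistent with the abstract's claim that changing P recovers LoRA.
p = np.ones(r)
assert np.allclose(matryoshka_lora_delta(A, B, p), B @ A)

# Any prefix length k gives a valid nested adapter of rank at most k,
# so a single trained adapter can serve every sub-rank at deployment.
delta_2 = matryoshka_lora_delta(A, B, p, k=2)
assert np.linalg.matrix_rank(delta_2) <= 2
```

The point of the diagonal placement is that truncating to a sub-rank is a simple slice of the same trained parameters, which is what makes dynamic rank selection at deployment possible without retraining.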