MatryoshkaLoRA: Learning Accurate Hierarchical Low-Rank Representations for LLM Fine-Tuning
May 8, 2026
Authors: Ionut-Vlad Modoranu, Mher Safaryan, Dan Alistarh
cs.AI
Abstract
With the rise in scale of deep learning models to billions of parameters, the computational cost of fine-tuning remains a significant barrier to deployment. While Low-Rank Adaptation (LoRA) has become the standard for parameter-efficient fine-tuning, the need to set a predefined, static rank r requires exhaustive grid searches to balance efficiency and performance. Existing rank-adaptive solutions such as DyLoRA mitigate this by sampling ranks during training from a predefined distribution. However, they often yield sub-optimal results at higher ranks due to a lack of consistent gradient signal across the full hierarchy of ranks, making these methods data-inefficient. In this paper, we propose MatryoshkaLoRA, a general, Matryoshka-inspired training framework for LoRA that learns accurate hierarchical low-rank representations by inserting a fixed, carefully crafted diagonal matrix P between the existing LoRA adapters to scale their sub-ranks accordingly. With this simple modification, our general framework recovers LoRA and DyLoRA by changing only P, and ensures that all sub-ranks embed the available gradient information efficiently. MatryoshkaLoRA supports dynamic rank selection with minimal degradation in accuracy. We further propose the Area Under the Rank-Accuracy Curve (AURAC), a metric that consistently evaluates the performance of hierarchical low-rank adapters. Our results demonstrate that MatryoshkaLoRA learns more accurate hierarchical low-rank representations than prior rank-adaptive approaches and achieves superior accuracy-performance trade-offs across ranks on the evaluated datasets. Our code is available at https://github.com/IST-DASLab/MatryoshkaLoRA.
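To make the core idea concrete, the sketch below computes a rank-truncated LoRA update with a fixed diagonal matrix P inserted between the adapters B and A, as the abstract describes. The function names, the NumPy framing, and the AURAC normalization are illustrative assumptions, not the paper's implementation; consult the linked repository for the actual method.

```python
import numpy as np

def matryoshka_lora_delta(B, A, p, r):
    """Hypothetical rank-r slice of a MatryoshkaLoRA-style weight update.

    Computes B[:, :r] @ diag(p[:r]) @ A[:r, :]: the usual LoRA update
    B @ A with a fixed diagonal P between the adapters, truncated to
    the first r sub-ranks.
    """
    # Scaling the first r columns of B by p[:r] multiplies by
    # diag(p[:r]) without materializing the diagonal matrix.
    return (B[:, :r] * p[:r]) @ A[:r, :]

def aurac(ranks, accuracies):
    """Area under the rank-accuracy curve, normalized by the rank span.

    One plausible reading of AURAC (trapezoidal integration); the
    paper's exact definition may differ.
    """
    ranks = np.asarray(ranks, dtype=float)
    return np.trapz(accuracies, ranks) / (ranks[-1] - ranks[0])

# With p = 1 (i.e. P = I), the full-rank slice reduces to plain LoRA: B @ A.
rng = np.random.default_rng(0)
d_out, d_in, R = 8, 6, 4
B = rng.standard_normal((d_out, R))
A = rng.standard_normal((R, d_in))
assert np.allclose(matryoshka_lora_delta(B, A, np.ones(R), R), B @ A)
```

Because every sub-rank receives a useful gradient signal during training, deployment can then pick any r ≤ R for `matryoshka_lora_delta` at inference time, trading accuracy for compute without retraining.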