MatryoshkaLoRA: Learning Accurate Hierarchical Low-Rank Representations for LLM Fine-Tuning

Abstract

As deep learning models scale to billions of parameters, the computational cost of fine-tuning remains a significant barrier to deployment. While Low-Rank Adaptation (LoRA) has become the standard for parameter-efficient fine-tuning, it requires a predefined, static rank r, and choosing that rank to balance efficiency and performance requires exhaustive grid searches. Existing rank-adaptive solutions such as DyLoRA mitigate this by sampling ranks from a predefined distribution during training. However, they often yield sub-optimal results at higher ranks because they lack consistent gradient signals across the full hierarchy of ranks, which makes them data-inefficient. In this paper, we propose MatryoshkaLoRA, a general, Matryoshka-inspired training framework for LoRA that learns accurate hierarchical low-rank representations by inserting a fixed, carefully crafted diagonal matrix P between the existing LoRA adapters to scale their sub-ranks. This simple modification lets the framework recover LoRA and DyLoRA by changing only P and ensures that all sub-ranks embed the available gradient information efficiently. MatryoshkaLoRA supports dynamic rank selection with minimal degradation in accuracy. We further propose Area Under the Rank Accuracy Curve (AURAC), a metric that consistently evaluates the performance of hierarchical low-rank adapters. Our results demonstrate that MatryoshkaLoRA learns more accurate hierarchical low-rank representations than prior rank-adaptive approaches and achieves superior accuracy-performance trade-offs across ranks on the evaluated datasets. Our code is available at https://github.com/IST-DASLab/MatryoshkaLoRA.
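
To make the role of the diagonal matrix P concrete, the following is a minimal NumPy sketch of a low-rank update of the form B P A. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify how P is constructed, so the 0/1 mask used here (`rank_mask`) is a hypothetical choice that simply truncates the adapter to a sub-rank, and `area_under_rank_accuracy` is an assumed trapezoidal reading of an area-under-curve score, which may differ from the paper's definition of AURAC. All function names are invented for this example.

```python
import numpy as np


def lora_delta(B, A, p):
    """Return B @ diag(p) @ A, a low-rank update with diagonal scaling p."""
    # Scaling column j of B by p[j] is equivalent to B @ diag(p).
    return (B * p) @ A


def rank_mask(r, b):
    """Hypothetical choice of P: keep the first b of r rank components."""
    p = np.zeros(r)
    p[:b] = 1.0
    return p


def area_under_rank_accuracy(ranks, accs):
    """Assumed AURAC-style score: trapezoidal area normalized by rank span."""
    ranks = np.asarray(ranks, dtype=float)
    accs = np.asarray(accs, dtype=float)
    area = np.sum((accs[1:] + accs[:-1]) / 2 * np.diff(ranks))
    return area / (ranks[-1] - ranks[0])


rng = np.random.default_rng(0)
d_out, d_in, r = 6, 5, 4
B = rng.standard_normal((d_out, r))
A = rng.standard_normal((r, d_in))

full = lora_delta(B, A, np.ones(r))      # P = I gives the plain LoRA update B @ A
sub = lora_delta(B, A, rank_mask(r, 2))  # masked P gives a rank-2 sub-adapter
```

With P set to the identity, the update reduces to the standard LoRA product B A, while the masked P yields the product of the first two columns of B and the first two rows of A. This nesting of smaller adapters inside a larger one is what allows a rank to be selected after training.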
