ShadowPEFT: Shadow Network for Parameter-Efficient Fine-Tuning

Abstract

Parameter-efficient fine-tuning (PEFT) reduces the training cost of full-parameter fine-tuning for large language models (LLMs) by freezing the pretrained backbone and training only a small set of task-specific parameters. However, existing approaches such as Low-Rank Adaptation (LoRA) adapt the model by adding independent low-rank perturbations directly to individual weights, which yields a local parameterization of adaptation. We propose ShadowPEFT, a centralized PEFT framework that instead performs layer-level refinement through a depth-shared shadow module. At each transformer layer, ShadowPEFT maintains a parallel shadow state and evolves it repeatedly to produce progressively richer hidden states. This design shifts adaptation from distributed weight-space perturbations to a shared layer-space refinement process. Because the shadow module is decoupled from the backbone, it can be reused across depth, pretrained independently, and optionally deployed in a detached mode, which benefits edge computing scenarios. Experiments on generation and understanding benchmarks show that ShadowPEFT matches or outperforms LoRA and DoRA under comparable trainable-parameter budgets. Additional analyses of shadow pretraining, cross-dataset transfer, parameter scaling, inference latency, and system-level evaluation suggest that centralized layer-space adaptation is a competitive and flexible alternative to conventional low-rank PEFT.
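
To make the contrast between distributed and centralized adaptation concrete, the sketch below counts trainable parameters in both settings. The LoRA count uses the standard factorization, in which a rank-r update to a d_out x d_in weight costs r * (d_in + d_out) parameters. The shared-module count is only an illustrative assumption: the function `shared_module_param_count`, its two-projection bottleneck shape, and all sizes are hypothetical and are not taken from the paper.

```python
def lora_param_count(num_layers, adapted_shapes, rank):
    """Trainable parameters when every adapted weight gets its own LoRA pair.

    A LoRA update to a (d_out x d_in) weight is B @ A with
    B: (d_out x rank) and A: (rank x d_in), so it costs rank * (d_in + d_out).
    """
    per_layer = sum(rank * (d_in + d_out) for d_out, d_in in adapted_shapes)
    return num_layers * per_layer


def shared_module_param_count(hidden_size, bottleneck):
    """Trainable parameters of ONE hypothetical bottleneck module
    (down-projection plus up-projection, no biases) reused at every layer.

    ASSUMPTION: this shape is for illustration only, not the paper's design.
    """
    return 2 * hidden_size * bottleneck


# Illustrative sizes (assumptions): 32 layers, hidden size 4096,
# LoRA rank 8 on two square projections per layer.
num_layers, hidden = 32, 4096
lora = lora_param_count(num_layers, [(hidden, hidden)] * 2, rank=8)
shared = shared_module_param_count(hidden, bottleneck=512)

print(lora)    # grows linearly with num_layers
print(shared)  # independent of num_layers
```

With these assumed sizes the two budgets coincide, which mirrors the "comparable trainable-parameter budgets" setting of the comparison. The structural difference is that the per-weight count scales with depth, while a depth-shared module spends its whole budget in one place.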
