One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning
June 13, 2023
Authors: Arnav Chavan, Zhuang Liu, Deepak Gupta, Eric Xing, Zhiqiang Shen
cs.AI
Abstract
We present Generalized LoRA (GLoRA), an advanced approach for universal
parameter-efficient fine-tuning tasks. Enhancing Low-Rank Adaptation (LoRA),
GLoRA employs a generalized prompt module to optimize pre-trained model weights
and adjust intermediate activations, providing more flexibility and capability
across diverse tasks and datasets. Moreover, GLoRA facilitates efficient
parameter adaptation by employing a scalable, modular, layer-wise structure
search that learns an individual adapter for each layer. Originating from a unified
mathematical formulation, GLoRA exhibits strong transfer learning, few-shot
learning and domain generalization abilities, as it adjusts to new tasks
through additional dimensions on weights and activations. Comprehensive
experiments demonstrate that GLoRA outperforms all previous methods in natural,
specialized, and structured benchmarks, achieving superior accuracy with fewer
parameters and computations on various datasets. Furthermore, our structural
re-parameterization design ensures that GLoRA incurs no extra inference cost,
rendering it a practical solution for resource-limited applications. Code is
available at: https://github.com/Arnav0400/ViT-Slim/tree/master/GLoRA.
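
The abstract's efficiency claim rests on structural re-parameterization: the learned low-rank updates can be folded back into the pre-trained weights, so deployment uses the original layer shapes and incurs no extra inference cost. The following is a minimal, illustrative PyTorch sketch of that general idea, not the authors' implementation: a LoRA-style adapter on a frozen linear layer plus a merge step. The class name LowRankAdaptedLinear, the rank argument, and the initialization are assumptions for illustration; GLoRA's actual formulation, with its generalized prompt module and per-layer structure search, is more general than this.

# Minimal, illustrative sketch (not the authors' code): a LoRA-style low-rank
# adapter on a frozen linear layer, with a merge step showing how structural
# re-parameterization folds the update into the base weight for inference.
# Class and argument names are assumptions, not from the paper.
import torch
import torch.nn as nn

class LowRankAdaptedLinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 4):
        super().__init__()
        self.base = base
        for p in self.base.parameters():       # freeze pre-trained weights
            p.requires_grad = False
        out_f, in_f = base.weight.shape
        # Trainable low-rank factors: delta_W = B @ A has rank <= `rank`.
        # B starts at zero so fine-tuning begins from the pre-trained model.
        self.A = nn.Parameter(torch.randn(rank, in_f) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_f, rank))

    def forward(self, x):
        # During fine-tuning: original path plus the low-rank update.
        return self.base(x) + x @ (self.B @ self.A).T

    @torch.no_grad()
    def merge(self) -> nn.Linear:
        # Re-parameterization: fold delta_W into the frozen weight, so the
        # merged layer has the same shape and FLOPs as the original one.
        merged = nn.Linear(self.base.in_features, self.base.out_features,
                           bias=self.base.bias is not None)
        merged.weight.copy_(self.base.weight + self.B @ self.A)
        if self.base.bias is not None:
            merged.bias.copy_(self.base.bias)
        return merged

# Usage: wrap a layer for fine-tuning, then merge for deployment.
layer = LowRankAdaptedLinear(nn.Linear(768, 768), rank=8)
x = torch.randn(2, 768)
assert torch.allclose(layer(x), layer.merge()(x), atol=1e-5)

The merge step is what makes the adapted and re-parameterized layers produce identical outputs while the deployed model keeps its original parameter count and latency, which is the property the abstract highlights for resource-limited applications.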