Decoupling Angles and Strength in Low-rank Adaptation
March 23, 2025
Authors: Massimo Bini, Leander Girrbach, Zeynep Akata
cs.AI
Abstract
Parameter-Efficient Fine-Tuning (PEFT) methods have recently gained
significant popularity thanks to the widespread availability of large-scale
pretrained models. These methods allow for quick adaptation to downstream tasks
with minimal computational cost. However, popular finetuning methods such as
LoRA exhibit limited robustness when it comes to hyperparameter choices or
extended training regimes, preventing optimal out-of-the-box performance. In
contrast, bounded approaches, such as ETHER, provide greater robustness but are
limited to extremely low-rank adaptations and fixed-strength transformations,
reducing their adaptation expressive power. In this work, we propose Decoupled
Low-rank Adaptation (DeLoRA), a novel finetuning method that normalizes and
scales learnable low-rank matrices. By bounding the distance of the
transformation, DeLoRA effectively decouples the angular learning from the
adaptation strength, enhancing robustness without compromising performance.
Through evaluations on subject-driven image generation, natural language
understanding, and instruction tuning, we show that DeLoRA matches or surpasses
the performance of competing PEFT methods, while exhibiting stronger robustness.
Code is available at https://github.com/ExplainableML/DeLoRA.
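For intuition, below is a minimal PyTorch sketch of the normalize-and-scale idea the abstract describes: each rank-one component of the low-rank update is normalized, and a single learnable strength parameter bounds the overall transformation. The class name `DeLoRALinear`, the initialization constants, and the exact placement of the normalizer are illustrative assumptions; refer to the linked repository for the authors' implementation.

```python
import torch
import torch.nn as nn


class DeLoRALinear(nn.Module):
    """Sketch of a DeLoRA-style adapter on top of a frozen linear layer.

    Each rank-1 component of the low-rank update B @ A is normalized to
    unit Frobenius norm, so A and B learn only the *direction* (angle) of
    the update, while a single learnable scalar `lam` controls its
    *strength*. This parameterization is inferred from the abstract and
    may differ in detail from the official implementation.
    """

    def __init__(self, base: nn.Linear, rank: int = 8, init_strength: float = 1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pretrained weights stay frozen

        out_features, in_features = base.weight.shape
        # Small random init for both factors; unlike vanilla LoRA, B cannot
        # start at exactly zero here because each component is normalized.
        self.A = nn.Parameter(0.02 * torch.randn(rank, in_features))
        self.B = nn.Parameter(0.02 * torch.randn(out_features, rank))
        self.lam = nn.Parameter(torch.tensor(float(init_strength)))  # learnable strength
        self.rank = rank

    def delta_weight(self) -> torch.Tensor:
        # ||b_i a_i^T||_F = ||b_i|| * ||a_i||, so dividing each rank-1 term
        # by this product gives it unit Frobenius norm.
        a_norms = self.A.norm(dim=1)           # (rank,)
        b_norms = self.B.norm(dim=0)           # (rank,)
        xi = 1.0 / (a_norms * b_norms + 1e-8)  # diagonal normalizer
        # Scaling by lam / rank bounds the Frobenius norm of the full
        # update by |lam|, whatever directions A and B have learned.
        return (self.lam / self.rank) * (self.B * xi) @ self.A

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + x @ self.delta_weight().t()


# Tiny smoke test: wrap a linear layer and run a forward pass.
layer = DeLoRALinear(nn.Linear(768, 512), rank=4)
out = layer(torch.randn(2, 768))
print(out.shape)  # torch.Size([2, 512])
```

Because the norm of the update is bounded by the learned strength alone, choices such as the learning rate or training length mainly affect the direction of the adaptation rather than its magnitude, which is the robustness property the abstract highlights.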