GIFT-SW: Gaussian noise Injected Fine-Tuning of Salient Weights for LLMs
August 27, 2024
Authors: Maxim Zhelnin, Viktor Moskvoretskii, Egor Shvetsov, Egor Venediktov, Mariya Krylova, Aleksandr Zuev, Evgeny Burnaev
cs.AI
Abstract
Parameter-Efficient Fine-Tuning (PEFT) methods have gained popularity and democratized the usage of Large Language Models (LLMs). Recent studies have shown that a small subset of weights significantly impacts performance. Based on this observation, we introduce a novel PEFT method, called Gaussian noise Injected Fine-Tuning of Salient Weights (GIFT-SW). Our method updates only salient columns, while injecting Gaussian noise into non-salient ones. To identify these columns, we developed a generalized sensitivity metric that extends and unifies metrics from previous studies. Experiments with LLaMA models demonstrate that GIFT-SW outperforms full fine-tuning and modern PEFT methods under the same computational budget. Moreover, GIFT-SW offers practical advantages for recovering the performance of models subjected to mixed-precision quantization while keeping salient weights in full precision.
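To make the procedure concrete, below is a minimal PyTorch sketch of a GIFT-SW-style linear layer: only the salient weight columns are trainable, and Gaussian noise is injected into the frozen, non-salient columns on each training-mode forward pass. The column-norm saliency proxy, the `GiftSWLinear` name, and the `noise_std` default are illustrative assumptions; the paper instead selects columns with its generalized sensitivity metric, which is not specified in the abstract.

```python
import torch
import torch.nn as nn


class GiftSWLinear(nn.Module):
    """Minimal sketch of a GIFT-SW-style linear layer (not the authors' code).

    Only the columns flagged as salient are trainable; Gaussian noise is
    added to the frozen, non-salient columns during training. The saliency
    criterion below (per-column L2 norm) is a hypothetical stand-in for the
    paper's generalized sensitivity metric.
    """

    def __init__(self, linear: nn.Linear, num_salient: int, noise_std: float = 0.01):
        super().__init__()
        weight = linear.weight.detach()  # shape: (out_features, in_features)

        # Hypothetical saliency proxy: per-column L2 norm of the weights.
        scores = weight.norm(dim=0)
        salient_idx = scores.topk(num_salient).indices
        mask = torch.zeros(weight.shape[1], dtype=torch.bool)
        mask[salient_idx] = True
        self.register_buffer("salient_mask", mask)

        # Trainable salient columns; frozen non-salient remainder.
        self.salient = nn.Parameter(weight[:, mask].clone())
        self.register_buffer("frozen", weight[:, ~mask].clone())
        self.bias = linear.bias
        self.noise_std = noise_std

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out_f = self.frozen.shape[0]
        in_f = self.salient_mask.numel()
        # Reassemble the full weight matrix from its two column groups.
        weight = torch.empty(
            out_f, in_f, device=self.salient.device, dtype=self.salient.dtype
        )
        weight[:, self.salient_mask] = self.salient
        frozen = self.frozen
        if self.training:  # inject Gaussian noise only while training
            frozen = frozen + torch.randn_like(frozen) * self.noise_std
        weight[:, ~self.salient_mask] = frozen
        return nn.functional.linear(x, weight, self.bias)


# Usage: wrap an existing layer, e.g. GiftSWLinear(nn.Linear(4096, 4096), 128)
```

In this sketch, gradients flow only through the salient columns, while the per-step noise on the frozen columns plays the regularizing role the abstract attributes to Gaussian noise injection.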