GIFT-SW: Gaussian noise Injected Fine-Tuning of Salient Weights for LLMs

August 27, 2024
作者: Maxim Zhelnin, Viktor Moskvoretskii, Egor Shvetsov, Egor Venediktov, Mariya Krylova, Aleksandr Zuev, Evgeny Burnaev
cs.AI

Abstract

Parameter Efficient Fine-Tuning (PEFT) methods have gained popularity and democratized the usage of Large Language Models (LLMs). Recent studies have shown that a small subset of weights significantly impacts performance. Based on this observation, we introduce a novel PEFT method, called Gaussian noise Injected Fine-Tuning of Salient Weights (GIFT-SW). Our method updates only salient columns, while injecting Gaussian noise into non-salient ones. To identify these columns, we developed a generalized sensitivity metric that extends and unifies metrics from previous studies. Experiments with LLaMA models demonstrate that GIFT-SW outperforms full fine-tuning and modern PEFT methods under the same computational budget. Moreover, GIFT-SW offers practical advantages for recovering the performance of models subjected to mixed-precision quantization while keeping the salient weights in full precision.
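
The mechanics described in the abstract can be sketched compactly. The following PyTorch code is an illustrative sketch, not the authors' implementation: the column score in salient_columns (row-sum of |W · grad|^alpha) is just one instance of the family that the paper's generalized sensitivity metric unifies, and GiftSWLinear, noise_std, and k=16 are hypothetical names and values.

```python
import torch

def salient_columns(weight, grad, k, alpha=1.0):
    """Score each column by a sensitivity criterion and return the top-k indices.

    The score used here, the row-sum of |W * grad|^alpha, is one illustrative
    instance of the family of metrics the paper's generalized metric unifies.
    """
    scores = (weight * grad).abs().pow(alpha).sum(dim=0)
    return torch.topk(scores, k).indices

class GiftSWLinear(torch.nn.Module):
    """Hypothetical linear layer: salient columns are trainable, while the
    remaining frozen columns receive Gaussian noise during training."""

    def __init__(self, weight, salient_idx, noise_std=1e-3):
        super().__init__()
        mask = torch.zeros(weight.shape[1], dtype=torch.bool)
        mask[salient_idx] = True
        self.register_buffer("mask", mask)
        # Non-salient columns are frozen (in practice these could be quantized).
        self.register_buffer("frozen", weight[:, ~mask].clone())
        # Only the salient columns are optimized.
        self.salient = torch.nn.Parameter(weight[:, mask].clone())
        self.noise_std = noise_std

    def forward(self, x):
        # Reassemble the full weight matrix from trainable and frozen parts.
        W = self.salient.new_zeros(self.salient.shape[0], self.mask.numel())
        W[:, self.mask] = self.salient
        frozen = self.frozen
        if self.training:
            # Gaussian noise is injected into the non-salient columns only.
            frozen = frozen + self.noise_std * torch.randn_like(frozen)
        W[:, ~self.mask] = frozen
        return x @ W.T

# Usage sketch: rank columns with one gradient pass, then fine-tune only them.
layer = torch.nn.Linear(512, 512, bias=False)
x = torch.randn(4, 512)
layer(x).pow(2).mean().backward()
idx = salient_columns(layer.weight.detach(), layer.weight.grad, k=16)
gift = GiftSWLinear(layer.weight.detach(), idx)
gift(x).sum().backward()  # gradients flow only to gift.salient
```

In the quantization use case from the abstract, the frozen buffer would hold the mixed-precision quantized columns while the salient columns stay in full precision.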
