GIFT-SW: Gaussian noise Injected Fine-Tuning of Salient Weights for LLMs
August 27, 2024
Authors: Maxim Zhelnin, Viktor Moskvoretskii, Egor Shvetsov, Egor Venediktov, Mariya Krylova, Aleksandr Zuev, Evgeny Burnaev
cs.AI
Abstract
Parameter Efficient Fine-Tuning (PEFT) methods have gained popularity and
democratized the usage of Large Language Models (LLMs). Recent studies have
shown that a small subset of weights significantly impacts performance. Based
on this observation, we introduce a novel PEFT method, called Gaussian noise
Injected Fine-Tuning of Salient Weights (GIFT-SW). Our method updates only
salient columns, while injecting Gaussian noise into non-salient ones. To
identify these columns, we developed a generalized sensitivity metric that
extends and unifies metrics from previous studies. Experiments with LLaMA
models demonstrate that GIFT-SW outperforms full fine-tuning and modern PEFT
methods under the same computational budget. Moreover, GIFT-SW offers practical
advantages: it recovers the performance of models subjected to mixed-precision
quantization while keeping salient weights in full precision.
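The core update described in the abstract — train only the salient columns of a weight matrix while perturbing the non-salient ones with Gaussian noise — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the paper's generalized sensitivity metric is replaced here by a simple column-magnitude proxy, and `gift_sw_step`, `lr`, and `noise_std` are hypothetical names chosen for the example.

```python
import numpy as np

def gift_sw_step(W, grad, sens, k, lr=1e-3, noise_std=1e-3, rng=None):
    """One illustrative GIFT-SW-style update on a weight matrix W.

    Columns with the top-k sensitivity scores are treated as salient and
    receive a gradient update; all other columns stay frozen and are
    perturbed with zero-mean Gaussian noise instead.
    """
    rng = np.random.default_rng(rng)
    salient = np.argsort(sens)[-k:]                       # top-k salient column indices
    non_salient = np.setdiff1d(np.arange(W.shape[1]), salient)

    W = W.copy()
    W[:, salient] -= lr * grad[:, salient]                # update only salient columns
    W[:, non_salient] += rng.normal(
        0.0, noise_std, size=(W.shape[0], non_salient.size)
    )                                                     # noise-inject the rest
    return W, salient

# Toy usage: a 4x6 weight matrix with a magnitude-based sensitivity proxy.
W = np.arange(24, dtype=float).reshape(4, 6)
grad = np.ones_like(W)
sens = np.abs(W).sum(axis=0)                              # proxy score per column
W_new, salient = gift_sw_step(W, grad, sens, k=2, noise_std=0.0, rng=0)
```

With `noise_std=0.0` the non-salient columns are returned unchanged, which makes the column split easy to inspect; in actual training the noise term would be nonzero.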