GIFT-SW: Gaussian noise Injected Fine-Tuning of Salient Weights for LLMs

August 27, 2024
Authors: Maxim Zhelnin, Viktor Moskvoretskii, Egor Shvetsov, Egor Venediktov, Mariya Krylova, Aleksandr Zuev, Evgeny Burnaev
cs.AI

Abstract

Parameter Efficient Fine-Tuning (PEFT) methods have gained popularity and democratized the usage of Large Language Models (LLMs). Recent studies have shown that a small subset of weights significantly impacts performance. Based on this observation, we introduce a novel PEFT method, called Gaussian noise Injected Fine Tuning of Salient Weights (GIFT-SW). Our method updates only salient columns, while injecting Gaussian noise into non-salient ones. To identify these columns, we developed a generalized sensitivity metric that extends and unifies metrics from previous studies. Experiments with LLaMA models demonstrate that GIFT-SW outperforms full fine-tuning and modern PEFT methods under the same computational budget. Moreover, GIFT-SW offers practical advantages: it can recover the performance of models subjected to mixed-precision quantization while keeping the salient weights in full precision.
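The abstract describes two mechanics: train only the salient columns of each weight matrix, and perturb the remaining columns with Gaussian noise during fine-tuning. Below is a minimal PyTorch sketch of that idea. The per-column weight norm used for column selection is a simplified stand-in for the paper's generalized sensitivity metric (which also unifies activation-aware criteria from prior work), and `select_salient_columns`, `GiftSWLinear`, and `noise_std` are illustrative names, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def select_salient_columns(weight: torch.Tensor, k: int) -> torch.Tensor:
    """Pick the k most sensitive input columns of a weight matrix.

    The per-column L2 norm is a simplified proxy for the paper's
    generalized sensitivity metric, which also uses activation statistics.
    """
    scores = weight.norm(dim=0)            # one score per input column
    return torch.topk(scores, k).indices


class GiftSWLinear(nn.Module):
    """Sketch of a GIFT-SW-style linear layer: only salient columns are
    trainable; non-salient columns stay frozen and receive Gaussian noise
    while in training mode."""

    def __init__(self, base: nn.Linear, salient_idx: torch.Tensor,
                 noise_std: float = 1e-3):
        super().__init__()
        self.register_buffer("frozen_weight", base.weight.detach().clone())
        self.register_buffer("salient_idx", salient_idx)
        self.noise_std = noise_std
        # Trainable copy of the salient columns only.
        self.salient_weight = nn.Parameter(
            base.weight[:, salient_idx].detach().clone())
        self.bias = base.bias

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weight = self.frozen_weight.clone()
        if self.training and self.noise_std > 0:
            noise = torch.randn_like(weight) * self.noise_std
            noise[:, self.salient_idx] = 0.0   # noise hits non-salient columns only
            weight = weight + noise
        weight[:, self.salient_idx] = self.salient_weight  # gradients flow here
        return F.linear(x, weight, self.bias)


# Usage: wrap a layer, then fine-tune only `salient_weight`.
base = nn.Linear(16, 8)
layer = GiftSWLinear(base, select_salient_columns(base.weight, k=4))
out = layer(torch.randn(2, 16))
```

In this sketch, gradients reach only `salient_weight`; switching the module to eval mode disables the noise injection, following the usual PyTorch train/eval convention.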
