

NeuroAda: Activating Each Neuron's Potential for Parameter-Efficient Fine-Tuning

October 21, 2025
Authors: Zhi Zhang, Yixian Shen, Congfeng Cao, Ekaterina Shutova
cs.AI

Abstract

Existing parameter-efficient fine-tuning (PEFT) methods primarily fall into two categories: addition-based and selective in-situ adaptation. The former, such as LoRA, introduces additional modules to adapt the model to downstream tasks, offering strong memory efficiency. However, its representational capacity is often limited, making it less suitable for fine-grained adaptation. In contrast, the latter directly fine-tunes a carefully chosen subset of the original model parameters, allowing for more precise and effective adaptation, but at the cost of significantly increased memory consumption. To reconcile this trade-off, we propose NeuroAda, a novel PEFT method that enables fine-grained model fine-tuning while maintaining high memory efficiency. Our approach first identifies important parameters (i.e., connections within the network) as in selective adaptation, and then introduces bypass connections for these selected parameters. During fine-tuning, only the bypass connections are updated, leaving the original model parameters frozen. Empirical results on 23+ tasks spanning both natural language generation and understanding demonstrate that NeuroAda achieves state-of-the-art performance with no more than 0.02% trainable parameters, while reducing CUDA memory usage by up to 60%. We release our code here: https://github.com/FightingFighting/NeuroAda.git.
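To make the bypass-connection idea concrete, below is a minimal PyTorch sketch of how such a layer could look, based only on the abstract's description. The selection rule (top-magnitude incoming weights per output neuron), the value k_per_neuron, and the module name NeuroAdaLinear are illustrative assumptions, not the authors' implementation; see the released code for the actual method.

```python
# Hypothetical sketch: frozen linear layer plus a sparse, trainable "bypass"
# over a selected subset of connections. Only the bypass receives gradients.
import torch
import torch.nn as nn


class NeuroAdaLinear(nn.Module):
    def __init__(self, base: nn.Linear, k_per_neuron: int = 4):
        super().__init__()
        # Original parameters are kept frozen, as described in the abstract.
        self.weight = nn.Parameter(base.weight.detach().clone(), requires_grad=False)
        self.bias = (
            nn.Parameter(base.bias.detach().clone(), requires_grad=False)
            if base.bias is not None else None
        )
        # Assumed selection criterion: keep the k largest-magnitude incoming
        # connections for each output neuron (each row of the weight matrix).
        idx = self.weight.abs().topk(k_per_neuron, dim=1).indices
        mask = torch.zeros_like(self.weight, dtype=torch.bool)
        mask.scatter_(1, idx, True)
        self.register_buffer("mask", mask)
        # Trainable bypass values, stored densely here for simplicity; the
        # mask zeroes out gradients for all unselected positions.
        self.bypass = nn.Parameter(torch.zeros_like(self.weight))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Effective weight = frozen weight + sparse bypass on selected connections.
        w = self.weight + self.bypass * self.mask
        return nn.functional.linear(x, w, self.bias)


if __name__ == "__main__":
    layer = NeuroAdaLinear(nn.Linear(16, 8), k_per_neuron=2)
    out = layer(torch.randn(4, 16))
    trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
    print(out.shape, trainable)
```

In this sketch only the bypass tensor is trainable while the base weights stay frozen, which mirrors the memory argument in the abstract: no optimizer state is needed for the original parameters.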