

SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation

September 10, 2024
Authors: Teng Hu, Jiangning Zhang, Ran Yi, Hongrui Huang, Yabiao Wang, Lizhuang Ma
cs.AI

Abstract

In recent years, the development of diffusion models has led to significant progress in image and video generation tasks, with pre-trained models like the Stable Diffusion series playing a crucial role. Inspired by model pruning, which lightens large pre-trained models by removing unimportant parameters, we propose a novel model fine-tuning method that makes full use of these ineffective parameters and endows the pre-trained model with new task-specific capabilities. In this work, we first investigate the importance of parameters in pre-trained diffusion models and discover that the smallest 10% to 20% of parameters by absolute value do not contribute to the generation process. Based on this observation, we propose a method termed SaRA that re-utilizes these temporarily ineffective parameters, which amounts to optimizing a sparse weight matrix to learn task-specific knowledge. To mitigate overfitting, we propose a nuclear-norm-based low-rank sparse training scheme for efficient fine-tuning. Furthermore, we design a new progressive parameter adjustment strategy to make full use of the re-trained/fine-tuned parameters. Finally, we propose a novel unstructural backpropagation strategy, which significantly reduces memory costs during fine-tuning. Our method enhances the generative capabilities of pre-trained models in downstream applications and outperforms traditional fine-tuning methods like LoRA in maintaining the model's generalization ability. We validate our approach through fine-tuning experiments on SD models, demonstrating significant improvements. SaRA also offers the practical advantage of requiring only a single line of code modification for efficient implementation, and it is seamlessly compatible with existing methods.

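The fine-tuning recipe sketched in the abstract can be made concrete with a short example. The PyTorch snippet below is a minimal, illustrative rendering of the stated ideas only: select roughly the 10-20% smallest-magnitude entries of a pre-trained weight, train a sparse additive update restricted to those positions, and regularize the update with a nuclear-norm penalty to keep it low rank. All helper names (build_sparse_mask, sparse_lowrank_loss) and hyperparameters are assumptions for illustration, not the authors' released implementation; the progressive parameter adjustment and unstructural backpropagation strategies are not shown.

```python
import torch

def build_sparse_mask(weight: torch.Tensor, ratio: float = 0.1) -> torch.Tensor:
    """Mark the `ratio` fraction of entries with the smallest absolute value.

    The abstract reports that the smallest ~10-20% of parameters by magnitude
    barely contribute to generation, so those positions are re-used for
    task-specific learning (hypothetical helper, not the official API).
    """
    k = max(1, int(weight.numel() * ratio))
    threshold = weight.abs().flatten().kthvalue(k).values
    return weight.abs() <= threshold


def sparse_lowrank_loss(task_loss: torch.Tensor,
                        delta: torch.Tensor,
                        mask: torch.Tensor,
                        lam: float = 1e-4) -> torch.Tensor:
    """Task loss plus a nuclear-norm penalty on the masked (sparse) update.

    The nuclear norm (sum of singular values) of the sparse update encourages
    a low-rank solution, which the abstract uses to mitigate overfitting.
    """
    sparse_delta = delta * mask  # only the selected entries are allowed to change
    nuclear = torch.linalg.matrix_norm(sparse_delta, ord="nuc")
    return task_loss + lam * nuclear


# Illustrative fine-tuning loop with placeholder tensors (names are assumptions).
pretrained_w = torch.randn(768, 768)                 # stand-in for a frozen SD weight
mask = build_sparse_mask(pretrained_w, ratio=0.1)    # positions of "ineffective" params
delta = torch.zeros_like(pretrained_w, requires_grad=True)  # trainable sparse update
optimizer = torch.optim.AdamW([delta], lr=1e-4)

for _ in range(3):                                   # a few dummy steps
    effective_w = pretrained_w + delta * mask        # weight actually used in the forward pass
    task_loss = effective_w.square().mean()          # placeholder for the diffusion training loss
    loss = sparse_lowrank_loss(task_loss, delta, mask)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The single-line usage mentioned in the abstract presumably hides this masked, nuclear-norm-regularized update behind a drop-in replacement in the training script; that packaging detail is not reproduced here.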