
Efficient Training with Denoised Neural Weights

July 16, 2024
Authors: Yifan Gong, Zheng Zhan, Yanyu Li, Yerlan Idelbayev, Andrey Zharkov, Kfir Aberman, Sergey Tulyakov, Yanzhi Wang, Jian Ren
cs.AI

Abstract

Good weight initialization serves as an effective measure to reduce the training cost of a deep neural network (DNN) model. The choice of how to initialize parameters is challenging and may require manual tuning, which can be time-consuming and prone to human error. To overcome such limitations, this work takes a novel step towards building a weight generator to synthesize the neural weights for initialization. We use the image-to-image translation task with generative adversarial networks (GANs) as an example due to the ease of collecting model weights spanning a wide range. Specifically, we first collect a dataset with various image editing concepts and their corresponding trained weights, which are later used for the training of the weight generator. To address the different characteristics among layers and the substantial number of weights to be predicted, we divide the weights into equal-sized blocks and assign each block an index. Subsequently, a diffusion model is trained with such a dataset using both text conditions of the concept and the block indexes. By initializing the image translation model with the denoised weights predicted by our diffusion model, the training requires only 43.3 seconds. Compared to training from scratch (i.e., Pix2pix), we achieve a 15x training time acceleration for a new concept while obtaining even better image generation quality.
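
The block-splitting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`split_into_blocks`, `merge_blocks`), the zero padding of the final block, and the concatenation of layers in dictionary order are all assumptions made here for illustration; the abstract states only that weights are divided into equal-sized blocks and that each block is assigned an index.

```python
from typing import Dict, List, Tuple


def split_into_blocks(
    layers: Dict[str, List[float]], block_size: int
) -> Tuple[List[List[float]], int]:
    """Flatten per-layer weights and cut them into equal-sized blocks.

    Hypothetical helper: blocks[i] is the block with index i, and the
    returned pad is the number of zeros appended to fill the last block.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    # Assumption: layers are concatenated in insertion order so that
    # block indices are reproducible across models with the same layout.
    flat = [w for name in layers for w in layers[name]]
    # Assumption: the tail is zero-padded up to a multiple of block_size.
    pad = (-len(flat)) % block_size
    flat.extend([0.0] * pad)
    blocks = [flat[i:i + block_size] for i in range(0, len(flat), block_size)]
    return blocks, pad


def merge_blocks(
    blocks: List[List[float]], pad: int, layer_sizes: Dict[str, int]
) -> Dict[str, List[float]]:
    """Inverse of split_into_blocks: rebuild per-layer weight lists."""
    flat = [w for block in blocks for w in block]
    if pad:
        flat = flat[:-pad]  # drop the padding added during splitting
    restored, offset = {}, 0
    for name, size in layer_sizes.items():
        restored[name] = flat[offset:offset + size]
        offset += size
    return restored


if __name__ == "__main__":
    # Toy weights standing in for a trained model's parameters.
    toy = {"conv1": [0.1, 0.2, 0.3], "conv2": [0.4, 0.5, 0.6, 0.7]}
    blocks, pad = split_into_blocks(toy, block_size=4)
    for index, block in enumerate(blocks):
        print(index, block)
    sizes = {name: len(values) for name, values in toy.items()}
    print(merge_blocks(blocks, pad, sizes) == toy)
```

In a setup like the one the abstract describes, each `(index, block)` pair would be a training sample for the diffusion model, conditioned on the concept's text and the block index. Predicted blocks would then be merged back into per-layer weights to initialize the image translation model.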