Neural Network Diffusion
February 20, 2024
Authors: Kai Wang, Zhaopan Xu, Yukun Zhou, Zelin Zang, Trevor Darrell, Zhuang Liu, Yang You
cs.AI
Abstract
Diffusion models have achieved remarkable success in image and video generation. In this work, we demonstrate that diffusion models can also generate high-performing neural network parameters. Our approach is simple, utilizing an autoencoder and a standard latent diffusion model. The autoencoder extracts latent representations of a subset of the trained network parameters. A diffusion model is then trained to synthesize these latent parameter representations from random noise. At inference time, it generates new representations that are passed through the autoencoder's decoder, whose outputs are ready to use as new subsets of network parameters. Across various architectures and datasets, our diffusion process consistently generates models of comparable or improved performance over trained networks, with minimal additional cost. Notably, we empirically find that the generated models perform differently from the trained networks. Our results encourage more exploration of the versatile use of diffusion models.
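The abstract describes a three-stage pipeline: flatten a subset of trained parameters into vectors, train an autoencoder on them, train a latent diffusion model on the encoded latents, then sample latents from noise and decode them into new parameters. The following is a minimal PyTorch sketch of that pipeline, not the authors' released implementation; the layer sizes, latent dimension Z, step count T, the NoisePredictor module, and the DDPM-style linear noise schedule are all illustrative assumptions.

```python
# Minimal sketch of parameter generation via latent diffusion.
# All sizes and module shapes below are illustrative assumptions.
import torch
import torch.nn as nn

D = 2048   # assumed length of the flattened parameter subset
Z = 128    # assumed latent dimension
T = 1000   # assumed number of diffusion steps

# 1) Autoencoder over flattened parameter vectors.
encoder = nn.Sequential(nn.Linear(D, 512), nn.ReLU(), nn.Linear(512, Z))
decoder = nn.Sequential(nn.Linear(Z, 512), nn.ReLU(), nn.Linear(512, D))

# 2) Latent diffusion: predict the noise added to a latent at timestep t,
#    conditioning on t through a learned embedding (hypothetical module).
class NoisePredictor(nn.Module):
    def __init__(self):
        super().__init__()
        self.t_embed = nn.Embedding(T, Z)
        self.net = nn.Sequential(nn.Linear(Z, 256), nn.ReLU(), nn.Linear(256, Z))

    def forward(self, z_t, t):
        return self.net(z_t + self.t_embed(t))

eps_model = NoisePredictor()
betas = torch.linspace(1e-4, 0.02, T)            # standard DDPM linear schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0)

def diffusion_loss(params_batch):
    """One training step: encode parameters, noise the latent, predict the noise."""
    z0 = encoder(params_batch)                    # (B, Z)
    t = torch.randint(0, T, (z0.shape[0],))       # random timestep per sample
    eps = torch.randn_like(z0)
    a = alphas_bar[t].unsqueeze(-1)               # (B, 1)
    z_t = a.sqrt() * z0 + (1 - a).sqrt() * eps    # forward noising
    return nn.functional.mse_loss(eps_model(z_t, t), eps)

@torch.no_grad()
def sample_parameters(n):
    """3) Reverse diffusion from pure noise, then decode to parameter vectors."""
    z = torch.randn(n, Z)
    for t in reversed(range(T)):
        t_batch = torch.full((n,), t, dtype=torch.long)
        eps = eps_model(z, t_batch)
        alpha_t = 1.0 - betas[t]
        z = (z - betas[t] / (1 - alphas_bar[t]).sqrt() * eps) / alpha_t.sqrt()
        if t > 0:
            z = z + betas[t].sqrt() * torch.randn_like(z)
    return decoder(z)  # ready to reshape and load back into the target network
```

In this reading of the abstract, params_batch would come from flattening checkpoints of many independently trained networks, and each vector returned by sample_parameters would be reshaped and loaded back into the target architecture in place of the original parameter subset before evaluation.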