Neural Network Diffusion
February 20, 2024
Authors: Kai Wang, Zhaopan Xu, Yukun Zhou, Zelin Zang, Trevor Darrell, Zhuang Liu, Yang You
cs.AI
Abstract
Diffusion models have achieved remarkable success in image and video generation. In this work, we demonstrate that diffusion models can also generate high-performing neural network parameters. Our approach is simple, utilizing an autoencoder and a standard latent diffusion model. The autoencoder extracts latent representations of a subset of the trained network parameters. A diffusion model is then trained to synthesize these latent parameter representations from random noise. At inference time, it generates new representations that are passed through the autoencoder's decoder, whose outputs are ready to use as new subsets of network parameters. Across various architectures and datasets, our diffusion process consistently generates models of comparable or improved performance over trained networks, with minimal additional cost. Notably, we empirically find that the generated models perform differently from the trained networks. Our results encourage more exploration of the versatile use of diffusion models.
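The abstract describes a three-stage pipeline: flatten a subset of trained parameters into vectors, train an autoencoder on them, train a latent diffusion model on the encoded latents, then sample latents from noise and decode them into new parameters. The following is a minimal PyTorch sketch of that pipeline, not the authors' released implementation; the layer sizes, latent dimension Z, step count T, the NoisePredictor module, and the DDPM-style linear noise schedule are all illustrative assumptions.

```python
# Minimal sketch of parameter generation via latent diffusion.
# All sizes and module shapes below are illustrative assumptions.
import torch
import torch.nn as nn

D = 2048   # assumed length of the flattened parameter subset
Z = 128    # assumed latent dimension
T = 1000   # assumed number of diffusion steps

# 1) Autoencoder over flattened parameter vectors.
encoder = nn.Sequential(nn.Linear(D, 512), nn.ReLU(), nn.Linear(512, Z))
decoder = nn.Sequential(nn.Linear(Z, 512), nn.ReLU(), nn.Linear(512, D))

# 2) Latent diffusion: predict the noise added to a latent at timestep t,
#    conditioning on t through a learned embedding (hypothetical module).
class NoisePredictor(nn.Module):
    def __init__(self):
        super().__init__()
        self.t_embed = nn.Embedding(T, Z)
        self.net = nn.Sequential(nn.Linear(Z, 256), nn.ReLU(), nn.Linear(256, Z))

    def forward(self, z_t, t):
        return self.net(z_t + self.t_embed(t))

eps_model = NoisePredictor()
betas = torch.linspace(1e-4, 0.02, T)            # standard DDPM linear schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0)

def diffusion_loss(params_batch):
    """One training step: encode parameters, noise the latent, predict the noise."""
    z0 = encoder(params_batch)                    # (B, Z)
    t = torch.randint(0, T, (z0.shape[0],))       # random timestep per sample
    eps = torch.randn_like(z0)
    a = alphas_bar[t].unsqueeze(-1)               # (B, 1)
    z_t = a.sqrt() * z0 + (1 - a).sqrt() * eps    # forward noising
    return nn.functional.mse_loss(eps_model(z_t, t), eps)

@torch.no_grad()
def sample_parameters(n):
    """3) Reverse diffusion from pure noise, then decode to parameter vectors."""
    z = torch.randn(n, Z)
    for t in reversed(range(T)):
        t_batch = torch.full((n,), t, dtype=torch.long)
        eps = eps_model(z, t_batch)
        alpha_t = 1.0 - betas[t]
        z = (z - betas[t] / (1 - alphas_bar[t]).sqrt() * eps) / alpha_t.sqrt()
        if t > 0:
            z = z + betas[t].sqrt() * torch.randn_like(z)
    return decoder(z)  # ready to reshape and load back into the target network
```

In this reading of the abstract, params_batch would come from flattening checkpoints of many independently trained networks, and each vector returned by sample_parameters would be reshaped and loaded back into the target architecture in place of the original parameter subset before evaluation.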