Neural Metamorphosis
October 10, 2024
Authors: Xingyi Yang, Xinchao Wang
cs.AI
Abstract
This paper introduces a new learning paradigm termed Neural Metamorphosis
(NeuMeta), which aims to build self-morphable neural networks. Rather than
crafting separate models for different architectures or sizes, NeuMeta directly
learns the continuous weight manifold of neural networks. Once trained, we can
sample weights for any-sized network directly from the manifold, even for
previously unseen configurations, without retraining. To achieve this ambitious
goal, NeuMeta trains neural implicit functions as hypernetworks. They accept
coordinates within the model space as input, and generate corresponding weight
values on the manifold. In other words, the implicit function is learned so
that the predicted weights perform well across various model sizes. When
training these models, we notice that the final performance is closely related
to the smoothness of the learned manifold. To enhance this smoothness, we employ
two strategies. First, we permute weight matrices to achieve intra-model
smoothness by solving the Shortest Hamiltonian Path problem. Second, we add
noise to the input coordinates when training the implicit function, ensuring
that models of various sizes show consistent outputs. As a result,
NeuMeta shows promising results in synthesizing parameters for various network
configurations. Our extensive tests in image classification, semantic
segmentation, and image generation reveal that NeuMeta sustains full-size
performance even at a 75% compression rate.
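
The permutation step mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes that smoothness is measured as the summed Euclidean distance between consecutive rows (output neurons) of a weight matrix, and it swaps in a greedy nearest-neighbour heuristic in place of an actual Shortest Hamiltonian Path solver. The names `path_length`, `greedy_order`, and `mlp` are hypothetical. The sketch also shows that applying the same permutation to the next layer's input columns leaves the network's output unchanged.

```python
import math
import random


def path_length(W, order):
    """Summed Euclidean distance between consecutive rows of W, visited in `order`."""
    return sum(math.dist(W[i], W[j]) for i, j in zip(order, order[1:]))


def greedy_order(W, start=0):
    """Nearest-neighbour heuristic for a short Hamiltonian path over the rows of W.

    Assumption: a stand-in for a real Shortest Hamiltonian Path solver.
    Falls back to the original order if the heuristic does not shorten the path.
    """
    n = len(W)
    order = [start]
    remaining = set(range(n)) - {start}
    while remaining:
        last = W[order[-1]]
        # Pick the closest unvisited row; ties are broken by index.
        nxt = min(remaining, key=lambda j: (math.dist(last, W[j]), j))
        order.append(nxt)
        remaining.remove(nxt)
    identity = list(range(n))
    if path_length(W, order) <= path_length(W, identity):
        return order
    return identity


def mlp(x, W1, b1, W2):
    """Two-layer perceptron with ReLU: W2 @ relu(W1 @ x + b1)."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) for row in W2]


random.seed(0)
n_in, n_hidden, n_out = 8, 16, 4
W1 = [[random.gauss(0, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b1 = [random.gauss(0, 1) for _ in range(n_hidden)]
W2 = [[random.gauss(0, 1) for _ in range(n_hidden)] for _ in range(n_out)]
x = [random.gauss(0, 1) for _ in range(n_in)]

# Reorder hidden neurons so that neighbouring rows of W1 are similar.
order = greedy_order(W1)
W1_p = [W1[i] for i in order]                    # permute rows of layer 1
b1_p = [b1[i] for i in order]                    # permute its bias the same way
W2_p = [[row[i] for i in order] for row in W2]   # permute columns of layer 2

y = mlp(x, W1, b1, W2)
y_p = mlp(x, W1_p, b1_p, W2_p)
print("path length before:", path_length(W1, list(range(n_hidden))))
print("path length after: ", path_length(W1, order))
```

Because the permutation only relabels hidden neurons, `y` and `y_p` agree up to floating-point rounding. Only the ordering of the weights changes, which is what makes them easier to fit with a smooth function of the coordinates.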