Neural Metamorphosis

Abstract

This paper introduces a new learning paradigm termed Neural Metamorphosis (NeuMeta), which aims to build self-morphable neural networks. Rather than crafting separate models for different architectures or sizes, NeuMeta directly learns the continuous weight manifold of neural networks. Once trained, we can sample weights for a network of any size directly from the manifold, even for previously unseen configurations, without retraining. To achieve this ambitious goal, NeuMeta trains neural implicit functions as hypernetworks. They accept coordinates within the model space as input and generate the corresponding weight values on the manifold. In other words, the implicit function is learned such that the predicted weights perform well across various model sizes. In training these models, we notice that the final performance is closely related to the smoothness of the learned manifold. To enhance this smoothness, we employ two strategies. First, we permute weight matrices to achieve intra-model smoothness by solving the Shortest Hamiltonian Path problem. Second, we add noise to the input coordinates when training the implicit function, ensuring that models of various sizes show consistent outputs. As such, NeuMeta shows promising results in synthesizing parameters for various network configurations. Our extensive tests in image classification, semantic segmentation, and image generation reveal that NeuMeta sustains full-size performance even at a 75% compression rate.
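
As a rough illustration of the two mechanisms described above, the following Python sketch queries a coordinate-based implicit function to produce weight matrices of different sizes, and reorders the rows of a matrix so that neighbouring rows are similar. It is not the NeuMeta implementation. The coordinate layout, the small random sine MLP standing in for a trained implicit function, and the names `init_inr`, `inr`, `sample_layer`, `greedy_path`, and `total_variation` are all assumptions made for illustration. The greedy nearest-neighbour ordering is only a simple heuristic for a short Hamiltonian path, not an exact solver, and the `noise` argument merely shows where a coordinate perturbation would enter. No training loop is included.

```python
import math
import random


def init_inr(hidden=16, seed=0):
    """Random 2-layer sine MLP; a stand-in for a *trained* implicit function."""
    rng = random.Random(seed)
    w1 = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(hidden)]
    b1 = [rng.gauss(0.0, 1.0) for _ in range(hidden)]
    w2 = [rng.gauss(0.0, 1.0) / math.sqrt(hidden) for _ in range(hidden)]
    return w1, b1, w2


def inr(params, coord):
    """Map one model-space coordinate to one scalar weight value."""
    w1, b1, w2 = params
    hidden = [math.sin(sum(w * c for w, c in zip(row, coord)) + b)
              for row, b in zip(w1, b1)]
    return sum(w * h for w, h in zip(w2, hidden))


def sample_layer(params, layer, rows, cols, max_dim=64, noise=0.0, seed=0):
    """Build a rows x cols weight matrix by querying the INR once per entry.

    Assumed coordinate: (layer, row, col, rows, cols), sizes scaled by max_dim.
    `noise` adds uniform jitter to every coordinate component.
    """
    rng = random.Random(seed)
    matrix = []
    for i in range(rows):
        row = []
        for j in range(cols):
            coord = [float(layer), i / max_dim, j / max_dim,
                     rows / max_dim, cols / max_dim]
            coord = [c + rng.uniform(-noise, noise) for c in coord]
            row.append(inr(params, coord))
        matrix.append(row)
    return matrix


def row_distance(a, b):
    """L1 distance between two weight rows."""
    return sum(abs(x - y) for x, y in zip(a, b))


def total_variation(matrix, order):
    """Sum of distances between consecutive rows when visited in `order`."""
    return sum(row_distance(matrix[order[k]], matrix[order[k + 1]])
               for k in range(len(order) - 1))


def greedy_path(matrix, start=0):
    """Nearest-neighbour heuristic for a short Hamiltonian path over rows."""
    remaining = [r for r in range(len(matrix)) if r != start]
    order = [start]
    while remaining:
        last = matrix[order[-1]]
        nxt = min(remaining, key=lambda r: row_distance(last, matrix[r]))
        remaining.remove(nxt)
        order.append(nxt)
    return order


# One implicit function, two network sizes, no retraining in between.
params = init_inr()
small = sample_layer(params, layer=0, rows=3, cols=4)
large = sample_layer(params, layer=0, rows=6, cols=8)

# Reordering rows so that similar rows become adjacent lowers total variation.
demo = [[0, 0], [10, 10], [1, 1], [11, 11]]
order = greedy_path(demo)                    # [0, 2, 1, 3]
before = total_variation(demo, [0, 1, 2, 3]) # 58
after = total_variation(demo, order)         # 22
```

In a real network, permuting the output rows of one layer leaves the computed function unchanged only if the same permutation is also applied to the corresponding input columns of the next layer. The sketch omits that bookkeeping.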