Idempotent Generative Network
November 2, 2023
Authors: Assaf Shocher, Amil Dravid, Yossi Gandelsman, Inbar Mosseri, Michael Rubinstein, Alexei A. Efros
cs.AI
Abstract
We propose a new approach for generative modeling based on training a neural
network to be idempotent. An idempotent operator is one that can be applied
sequentially without changing the result beyond the initial application, namely
f(f(z))=f(z). The proposed model f is trained to map a source distribution
(e.g., Gaussian noise) to a target distribution (e.g., realistic images) using
the following objectives: (1) Instances from the target distribution should map
to themselves, namely f(x)=x. We define the target manifold as the set of all
instances that f maps to themselves. (2) Instances from the source
distribution should map onto the defined target manifold. This is achieved by
optimizing the idempotence term f(f(z))=f(z), which encourages the range of
f(z) to be on the target manifold. Under ideal assumptions, such a process
provably converges to the target distribution. This strategy results in a model
capable of generating an output in one step, maintaining a consistent latent
space, while also allowing sequential applications for refinement.
Additionally, we find that by processing inputs from both target and source
distributions, the model adeptly projects corrupted or modified data back to
the target manifold. This work is a first step towards a "global projector"
that enables projecting any input into a target data distribution.
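The interplay of the two objectives can be sketched with a deliberately minimal toy, not the authors' implementation: a one-parameter "network" f(z) = w*z, trained by hand-derived gradient descent on (1) a reconstruction term over target samples and (2) an idempotence term over source samples. In the idempotence term the inner application is treated as a constant so that the gradient only adjusts the outer application; this is one natural way to realize the objective, not necessarily the paper's exact scheme. The names `w`, `target`, and `source` are illustrative choices.

```python
import random

random.seed(0)

w = 0.2    # single parameter of the toy model f(z) = w * z
lr = 0.05  # gradient-descent step size

# Hypothetical stand-ins for the two distributions in the abstract.
target = [random.uniform(1.0, 2.0) for _ in range(32)]  # "real data" x
source = [random.gauss(0.0, 1.0) for _ in range(32)]    # "noise" z

for _ in range(500):
    grad = 0.0
    # (1) Reconstruction: push f(x) = w*x towards x.
    #     d/dw (w*x - x)^2 = 2*(w*x - x)*x
    for x in target:
        grad += 2 * (w * x - x) * x
    # (2) Idempotence: push f(f(z)) towards f(z), treating the inner
    #     application y = f(z) as a constant (no gradient through it).
    #     d/dw (w*y - y)^2 = 2*(w*y - y)*y
    for z in source:
        y = w * z
        grad += 2 * (w * y - y) * y
    w -= lr * grad / (len(target) + len(source))

# The trained f is (numerically) idempotent: f(f(z)) == f(z).
```

In this 1-D linear setting the optimum is the identity map (w = 1), which makes idempotence trivial; the paper instead trains a deep network where reconstruction is enforced only on data, so the learned f is far from the identity on noise inputs. The sketch only shows how the two loss terms and the constant inner application fit together.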