Interpreting the Weight Space of Customized Diffusion Models
June 13, 2024
Authors: Amil Dravid, Yossi Gandelsman, Kuan-Chieh Wang, Rameen Abdal, Gordon Wetzstein, Alexei A. Efros, Kfir Aberman
cs.AI
Abstract
We investigate the space of weights spanned by a large collection of
customized diffusion models. We populate this space by creating a dataset of
over 60,000 models, each of which is a base model fine-tuned to insert a
different person's visual identity. We model the underlying manifold of these
weights as a subspace, which we term weights2weights. We demonstrate three
immediate applications of this space -- sampling, editing, and inversion.
First, as each point in the space corresponds to an identity, sampling a set of
weights from it results in a model encoding a novel identity. Next, we find
linear directions in this space corresponding to semantic edits of the identity
(e.g., adding a beard). These edits persist in appearance across generated
samples. Finally, we show that inverting a single image into this space
reconstructs a realistic identity, even if the input image is out of
distribution (e.g., a painting). Our results indicate that the weight space of
fine-tuned diffusion models behaves as an interpretable latent space of
identities.
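
The three operations named in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the weight vectors are random stand-ins rather than real fine-tuned models, the subspace is fitted with PCA (the abstract says only that the weights are modeled as a subspace), the edit direction is an arbitrary basis vector rather than a learned semantic direction, and inversion is reduced to the projection step that keeps weights inside the subspace. It is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a dataset of fine-tuned models: each row is one model's
# flattened weight vector (hypothetical sizes, random data).
n_models, n_weights, n_components = 200, 50, 5
latent = rng.normal(size=(n_models, n_components))
mixing = rng.normal(size=(n_components, n_weights))
weights = latent @ mixing + 0.01 * rng.normal(size=(n_models, n_weights))

# Fit a linear subspace with PCA (an assumption of this sketch), via SVD.
mean = weights.mean(axis=0)
_, singular_values, vt = np.linalg.svd(weights - mean, full_matrices=False)
basis = vt[:n_components]  # shape (k, n_weights), orthonormal rows
coeff_std = singular_values[:n_components] / np.sqrt(n_models - 1)


def sample_weights():
    """Sampling: draw subspace coefficients and map them back to weights."""
    coeffs = rng.normal(size=n_components) * coeff_std
    return mean + coeffs @ basis


def edit_weights(w, direction, strength):
    """Editing: move along a unit-length linear direction inside the subspace."""
    in_subspace = (direction @ basis.T) @ basis
    return w + strength * in_subspace / np.linalg.norm(in_subspace)


def project_weights(w):
    """Inversion constraint: keep only the component that lies in the subspace."""
    return mean + ((w - mean) @ basis.T) @ basis


new_model = sample_weights()
edited_model = edit_weights(new_model, basis[0], strength=2.0)
```

In this sketch a sampled point and its edited version both stay inside the fitted subspace, which corresponds to the property the abstract relies on: every point in the space is meant to be a valid set of model weights.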