Unsupervised Compositional Concepts Discovery with Text-to-Image Generative Models
June 8, 2023
Authors: Nan Liu, Yilun Du, Shuang Li, Joshua B. Tenenbaum, Antonio Torralba
cs.AI
Abstract
Text-to-image generative models have enabled high-resolution image synthesis
across different domains, but require users to specify the content they wish to
generate. In this paper, we consider the inverse problem -- given a collection
of different images, can we discover the generative concepts that represent
each image? We present an unsupervised approach to discover generative concepts
from a collection of images, disentangling different art styles in paintings,
objects, and lighting from kitchen scenes, and discovering image classes given
ImageNet images. We show how such generative concepts can accurately represent
the content of images, be recombined and composed to generate new artistic and
hybrid images, and be further used as a representation for downstream
classification tasks.