Self-conditioned Image Generation via Generating Representations
December 6, 2023
Authors: Tianhong Li, Dina Katabi, Kaiming He
cs.AI
Abstract
This paper presents Representation-Conditioned image
Generation (RCG), a simple yet effective image generation framework
which sets a new benchmark in class-unconditional image generation. RCG does
not condition on any human annotations. Instead, it conditions on a
self-supervised representation distribution which is mapped from the image
distribution using a pre-trained encoder. During generation, RCG samples from
this representation distribution using a representation diffusion model (RDM),
and employs a pixel generator to craft image pixels conditioned on the sampled
representation. Such a design provides substantial guidance during the
generative process, resulting in high-quality image generation. Tested on
ImageNet 256×256, RCG achieves a Fréchet Inception Distance (FID) of
3.31 and an Inception Score (IS) of 253.4. These results not only significantly
improve the state of the art in class-unconditional image generation but also
rival the current leading methods in class-conditional image generation,
bridging the long-standing performance gap between these two tasks. Code is
available at https://github.com/LTH14/rcg.
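
The two-stage generation flow described in the abstract (sample a representation, then generate pixels conditioned on it) can be illustrated with a toy sketch. This is not the authors' implementation: the names `toy_denoise_step`, `sample_representation`, `pixel_generator`, and `generate`, the sizes, and the fixed-target "denoiser" are all hypothetical stand-ins for the learned RDM and pixel generator, chosen only to show how the pieces connect.

```python
import math
import random

REP_DIM = 8    # toy representation size (assumption, not from the paper)
IMG_SIDE = 4   # toy "image" is IMG_SIDE x IMG_SIDE (assumption)


def toy_denoise_step(z, target):
    # Placeholder for one learned RDM denoising step:
    # move the current sample halfway toward a fixed target vector.
    return [zi + 0.5 * (ti - zi) for zi, ti in zip(z, target)]


def sample_representation(rng, num_steps=10):
    # Stage 1: start from Gaussian noise and refine it iteratively,
    # standing in for sampling from the representation distribution.
    target = [1.0] * REP_DIM
    z = [rng.gauss(0.0, 1.0) for _ in range(REP_DIM)]
    for _ in range(num_steps):
        z = toy_denoise_step(z, target)
    return z


def pixel_generator(rep, weights):
    # Stage 2: map the sampled representation to pixel values in [-1, 1].
    # A fixed linear map plus tanh stands in for the learned generator.
    flat = [math.tanh(sum(w * r for w, r in zip(row, rep))) for row in weights]
    return [flat[i * IMG_SIDE:(i + 1) * IMG_SIDE] for i in range(IMG_SIDE)]


def generate(seed=0):
    # No class label or other human annotation enters the pipeline:
    # the only conditioning signal is the sampled representation.
    rng = random.Random(seed)
    w_rng = random.Random(123)  # fixed "pretrained" toy weights
    weights = [[w_rng.gauss(0.0, 1.0) for _ in range(REP_DIM)]
               for _ in range(IMG_SIDE * IMG_SIDE)]
    rep = sample_representation(rng)
    return rep, pixel_generator(rep, weights)


rep, image = generate(seed=0)
```

In the actual framework both stages are trained networks, and the representation space comes from a pre-trained self-supervised encoder. The sketch only mirrors the order of operations at generation time.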