Diffusion360: Seamless 360 Degree Panoramic Image Generation based on Diffusion Models
November 22, 2023
Authors: Mengyang Feng, Jinlin Liu, Miaomiao Cui, Xuansong Xie
cs.AI
Abstract
This is a technical report on the 360-degree panoramic image generation task
based on diffusion models. Unlike ordinary 2D images, 360-degree panoramic
images capture the entire 360°×180° field of view, so the rightmost and
leftmost edges of a 360-degree panoramic image must be continuous. Achieving
this continuity is the main challenge in this field. However, current diffusion
pipelines are not suited to generating such seamless 360-degree panoramic
images. To this end, we propose a circular blending strategy applied at both
the denoising and VAE decoding stages to maintain geometric continuity.
Based on this, we present two models for Text-to-360-panoramas and
Single-Image-to-360-panoramas tasks. The code has been released as an
open-source project at
https://github.com/ArcherFMY/SD-T2I-360PanoImage
and on ModelScope at
https://www.modelscope.cn/models/damo/cv_diffusion_text-to-360panorama-image_generation/summary
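
The abstract names a circular blending strategy but does not give its details, so the following is only a minimal NumPy sketch of one way such a blend could work, not the released implementation. It assumes the latent or decoded image has been extended horizontally so that its last `overlap` columns cover the same content as its first `overlap` columns, and it assumes a linear weight ramp. The function name `circular_blend` and the parameter `overlap` are hypothetical.

```python
import numpy as np


def circular_blend(extended, overlap):
    """Blend a wrapped right-hand strip back into the left edge.

    `extended` has shape (..., H, W + overlap). Its last `overlap` columns
    are assumed to depict the same region as its first `overlap` columns.
    Returns an array of shape (..., H, W) whose left and right edges meet
    without a visible seam.
    """
    extended = np.asarray(extended, dtype=float)
    width = extended.shape[-1] - overlap
    if overlap <= 0 or overlap > width:
        raise ValueError("overlap must be in the range 1..W")

    left = extended[..., :overlap]    # original left strip
    wrapped = extended[..., width:]   # strip that continues past the right edge

    # Assumed linear ramp: at the seam (column 0) take the wrapped strip, which
    # continues smoothly from the right edge; by the end of the strip take the
    # original left content, which continues smoothly into the interior.
    weight = np.linspace(0.0, 1.0, overlap)

    out = extended[..., :width].copy()
    out[..., :overlap] = weight * left + (1.0 - weight) * wrapped
    return out
```

Under these assumptions the same function could be applied to the latent after each denoising step and to the image after VAE decoding, since both are arrays whose last axis is the horizontal direction. If the two strips already agree, the result is simply the first W columns of the input.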