Diffusion360: Seamless 360 Degree Panoramic Image Generation based on Diffusion Models
November 22, 2023
Authors: Mengyang Feng, Jinlin Liu, Miaomiao Cui, Xuansong Xie
cs.AI
Abstract
This is a technical report on 360-degree panoramic image generation based on diffusion models. Unlike ordinary 2D images, a 360-degree panoramic image captures the entire 360°×180° field of view, so its rightmost and leftmost edges must be continuous. Maintaining this continuity is the main challenge in the field, and the current diffusion pipeline is not suited to generating such seamless 360-degree panoramic images. To this end, we propose a circular blending strategy, applied in both the denoising and VAE decoding stages, to maintain geometric continuity. Based on this strategy, we present two models, one for the Text-to-360-panoramas task and one for the Single-Image-to-360-panoramas task. The code has been released as an open-source project at https://github.com/ArcherFMY/SD-T2I-360PanoImage and on ModelScope at https://www.modelscope.cn/models/damo/cv_diffusion_text-to-360panorama-image_generation/summary
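The abstract does not spell out how circular blending is computed, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes the panorama (or latent) is produced slightly wider than needed, with the last `extra` columns re-rendering the first `extra` columns as a continuation past the right edge. The function name `circular_blend`, the `extra` parameter, and the linear weight ramp are all hypothetical choices made for illustration.

```python
import numpy as np


def circular_blend(extended: np.ndarray, extra: int) -> np.ndarray:
    """Blend a wrap-around strip into the left border of a panorama.

    `extended` has shape (H, W + extra, C). Its last `extra` columns are
    assumed to depict the same region as its first `extra` columns, but
    rendered as a continuation of the right edge. Returns an array of
    shape (H, W, C) whose left and right borders meet smoothly.
    """
    if extended.ndim != 3:
        raise ValueError("expected an array of shape (H, W + extra, C)")
    width = extended.shape[1] - extra
    if extra <= 0 or extra > width:
        raise ValueError("extra must satisfy 0 < extra <= W")

    out = extended[:, :width].astype(np.float64, copy=True)
    tail = extended[:, width:].astype(np.float64)

    # Assumed linear ramp: weight 0 at the seam (use the tail, which
    # continues from the right edge) rising to 1 (use the original left
    # content) at the inner end of the strip.
    w = np.linspace(0.0, 1.0, extra).reshape(1, extra, 1)
    out[:, :extra] = (1.0 - w) * tail + w * out[:, :extra]
    return out


if __name__ == "__main__":
    # Toy example: a horizontal ramp has a large jump between its last
    # and first columns; blending reduces that jump at the wrap seam.
    cols = np.arange(12, dtype=np.float64).reshape(1, 12, 1)
    ext = np.repeat(cols, 2, axis=0)
    pano = circular_blend(ext, extra=4)
    print(pano.shape)
    print(pano[0, :, 0])
```

Under the same assumption, the routine could be applied to the latent at each denoising step and again to the decoded image, which is where the abstract says the strategy is used. The exact strip width and weighting in the released code may differ.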