Diffusion360: Seamless 360 Degree Panoramic Image Generation based on Diffusion Models

Abstract

This technical report addresses 360-degree panoramic image generation with diffusion models. Unlike ordinary 2D images, a 360-degree panoramic image captures the entire 360°×180° field of view, so its leftmost and rightmost edges must be continuous; this is the main challenge in the field. Current diffusion pipelines, however, are not suited to generating such seamless panoramas. To this end, we propose a circular blending strategy, applied at both the denoising and VAE decoding stages, to maintain geometric continuity. Based on this strategy, we present two models, one for the Text-to-360-panoramas task and one for the Single-Image-to-360-panoramas task. The code has been released as an open-source project at https://github.com/ArcherFMY/SD-T2I-360PanoImage and on ModelScope at https://www.modelscope.cn/models/damo/cv_diffusion_text-to-360panorama-image_generation/summary
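
To make the circular blending idea concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation. It assumes that each row has been extended on the right by `overlap` extra columns that re-render the first `overlap` columns as a continuation of the right edge, and that a linear ramp serves as the blending weight. The function name `circular_blend` and both assumptions are illustrative and are not taken from the report.

```python
def circular_blend(extended, overlap):
    """Blend the wrapped tail of each row into its head, then crop the tail.

    `extended` is a list of rows. Each row holds `width + overlap` values,
    where the last `overlap` values are assumed to re-render the first
    `overlap` columns as a continuation of the right edge.
    """
    if overlap <= 0:
        # Nothing to blend: return a copy of the input rows.
        return [list(row) for row in extended]

    blended = []
    for row in extended:
        width = len(row) - overlap
        if width < overlap:
            raise ValueError("overlap must not exceed the cropped width")
        out = list(row[:width])  # keep only the original columns
        for x in range(overlap):
            # Linear ramp (an assumption): weight 0 at the seam, so the
            # wrapped tail dominates there, rising toward 1 further in.
            w = x / overlap
            out[x] = row[width + x] * (1.0 - w) + row[x] * w
        blended.append(out)
    return blended


if __name__ == "__main__":
    # One row of width 4 with a 2-column wrapped tail.
    print(circular_blend([[8.0, 8.0, 4.0, 2.0, 2.0, 4.0]], overlap=2))
```

For the single-row input `[[8.0, 8.0, 4.0, 2.0, 2.0, 4.0]]` with `overlap=2`, the function returns `[[2.0, 6.0, 4.0, 2.0]]`: the left edge now equals the right edge, so the row wraps around without a jump. In a diffusion pipeline the same idea would presumably be applied to latent tensors at each denoising step and to the decoded image, but those details are beyond this sketch.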
