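Diffusion360: Seamless 360 Degree Panoramic Image Generation based on Diffusion Models

Abstract

This is a technical report on the task of generating 360-degree panoramic images with diffusion models. Unlike ordinary 2D images, a 360-degree panoramic image captures the entire 360° × 180° field of view. Its rightmost and leftmost edges must therefore be continuous, which is the main challenge in this field. Current diffusion pipelines, however, are not suited to generating such seamless 360-degree panoramic images. To address this, we propose a circular blending strategy applied at both the denoising and VAE decoding stages to maintain geometric continuity. Building on this strategy, we present two models: one for text-to-360-panorama generation and one for single-image-to-360-panorama generation. The code is released as an open-source project at https://github.com/ArcherFMY/SD-T2I-360PanoImage and https://www.modelscope.cn/models/damo/cv_diffusion_text-to-360panorama-image_generation/summary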

English

This is a technical report on the 360-degree panoramic image generation task based on diffusion models. Unlike ordinary 2D images, 360-degree panoramic images capture the entire $360^\circ \times 180^\circ$ field of view. The rightmost and leftmost sides of a 360-degree panoramic image should therefore be continuous, which is the main challenge in this field. However, the current diffusion pipeline is not suited to generating such seamless 360-degree panoramic images. To this end, we propose a circular blending strategy at both the denoising and VAE decoding stages to maintain geometric continuity. Based on this, we present two models, for the Text-to-360-panoramas and Single-Image-to-360-panoramas tasks. The code has been released as an open-source project at https://github.com/ArcherFMY/SD-T2I-360PanoImage and on ModelScope at https://www.modelscope.cn/models/damo/cv_diffusion_text-to-360panorama-image_generation/summary
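
The abstract names circular blending but does not spell out the blending rule. The following is a minimal sketch of one plausible reading, not the released implementation. It assumes the image (or latent) is generated on a canvas that is `overlap` columns wider than the final panorama, so that the extra right-hand strip depicts the same content as the left-hand strip, and that the two strips are mixed with linear weights before cropping. The function name `circular_blend`, the `overlap` parameter, and the linear weighting are all illustrative assumptions.

```python
def circular_blend(extended, overlap):
    """Blend a horizontally extended image into a seamless panorama.

    extended: list of rows, each with width + overlap values. The last
        `overlap` columns are ASSUMED to depict the same content as the
        first `overlap` columns (the canvas wraps around once).
    overlap: number of wrapped columns (hypothetical parameter).
    Returns a list of rows, each with `width` values.
    """
    total = len(extended[0])
    width = total - overlap
    if not 0 < overlap <= width:
        raise ValueError("overlap must satisfy 0 < overlap <= width")

    blended = []
    for row in extended:
        out = list(row[:width])  # crop to the final panorama width
        for j in range(overlap):
            # Weight of the left strip rises linearly from 0 to 1, so the
            # new column 0 comes entirely from the wrapped right strip and
            # therefore continues smoothly from column width - 1.
            w = j / (overlap - 1) if overlap > 1 else 0.5
            out[j] = w * row[j] + (1.0 - w) * row[width + j]
        blended.append(out)
    return blended


# Usage: a ramp 0..11 stands in for an extended row (width 8, overlap 4).
rows = [[float(j) for j in range(12)] for _ in range(2)]
panorama = circular_blend(rows, overlap=4)
```

In this toy ramp, an unblended crop would jump from 7 at the right edge back to 0 at the left edge. After blending, the left edge starts at 8, the value that directly follows 7 on the extended canvas, and eases back to the original left-strip values across the overlap. Under the abstract's description, the same kind of step would be applied both to the latents during denoising and to the decoded pixels after the VAE.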
