Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models

Abstract

We introduce Edify Image, a family of diffusion models capable of generating photorealistic image content with pixel-perfect accuracy. Edify Image uses cascaded pixel-space diffusion models trained with a novel Laplacian diffusion process, in which image signals at different frequency bands are attenuated at different rates. Edify Image supports a wide range of applications, including text-to-image synthesis, 4K upsampling, ControlNets, 360 HDR panorama generation, and finetuning for image customization.
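
To make the idea of band-dependent attenuation concrete, the following is a minimal sketch and not the Edify Image implementation. It assumes a 1-D signal standing in for an image, a Laplacian-style pyramid built from pair averaging and nearest-neighbour upsampling, an exponential attenuation `exp(-rate * t)`, and variance-preserving Gaussian noise. All function names and the example `rates` are hypothetical and chosen only for illustration.

```python
import math
import random


def downsample(x):
    """Average adjacent pairs; len(x) must be even."""
    return [(x[i] + x[i + 1]) / 2.0 for i in range(0, len(x), 2)]


def upsample(x):
    """Nearest-neighbour upsampling by a factor of 2."""
    return [v for v in x for _ in range(2)]


def laplacian_bands(x, levels):
    """Split x into `levels` bands: finest residual first, coarse base last."""
    bands = []
    current = list(x)
    for _ in range(levels - 1):
        coarse = downsample(current)
        up = upsample(coarse)
        bands.append([a - b for a, b in zip(current, up)])
        current = coarse
    bands.append(current)
    return bands


def reconstruct(bands):
    """Invert laplacian_bands by upsampling and adding the residuals back."""
    current = bands[-1]
    for residual in reversed(bands[:-1]):
        current = [a + b for a, b in zip(upsample(current), residual)]
    return current


def attenuation(t, rate):
    """Assumed signal scale at time t >= 0; a larger rate fades the band faster."""
    return math.exp(-rate * t)


def noisy_bands(bands, t, rates, rng):
    """Attenuate each band at its own rate and add matching Gaussian noise."""
    out = []
    for band, rate in zip(bands, rates):
        a = attenuation(t, rate)
        sigma = math.sqrt(1.0 - a * a)  # variance-preserving choice (assumption)
        out.append([a * v + sigma * rng.gauss(0.0, 1.0) for v in band])
    return out


signal = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0]
bands = laplacian_bands(signal, levels=3)
rates = [8.0, 4.0, 1.0]  # hypothetical: the finest band is attenuated fastest
corrupted = noisy_bands(bands, t=0.5, rates=rates, rng=random.Random(0))
```

At `t = 0` every band is returned unchanged, and as `t` grows the high-frequency residuals are replaced by noise before the coarse base is. This ordering is the property the sketch is meant to illustrate. The actual schedules and pyramid construction used by Edify Image are not specified in this abstract.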