LDM3D-VR: Latent Diffusion Model for 3D VR

November 6, 2023
Authors: Gabriela Ben Melech Stan, Diana Wofk, Estelle Aflalo, Shao-Yen Tseng, Zhipeng Cai, Michael Paulitsch, Vasudev Lal
cs.AI

Abstract

Latent diffusion models have proven to be state-of-the-art in the creation and manipulation of visual outputs. However, as far as we know, the generation of depth maps jointly with RGB is still limited. We introduce LDM3D-VR, a suite of diffusion models targeting virtual reality development that includes LDM3D-pano and LDM3D-SR. These models enable the generation of panoramic RGBD based on textual prompts and the upscaling of low-resolution inputs to high-resolution RGBD, respectively. Our models are fine-tuned from existing pretrained models on datasets containing panoramic/high-resolution RGB images, depth maps and captions. Both models are evaluated in comparison to existing related methods.