

MVDream: Multi-view Diffusion for 3D Generation

August 31, 2023
Authors: Yichun Shi, Peng Wang, Jianglong Ye, Mai Long, Kejie Li, Xiao Yang
cs.AI

Abstract

We propose MVDream, a multi-view diffusion model that is able to generate geometrically consistent multi-view images from a given text prompt. By leveraging image diffusion models pre-trained on large-scale web datasets and a multi-view dataset rendered from 3D assets, the resulting multi-view diffusion model can achieve both the generalizability of 2D diffusion and the consistency of 3D data. Such a model can thus be applied as a multi-view prior for 3D generation via Score Distillation Sampling, where it greatly improves the stability of existing 2D-lifting methods by solving the 3D consistency problem. Finally, we show that the multi-view diffusion model can also be fine-tuned in a few-shot setting for personalized 3D generation, i.e., the DreamBooth3D application, where the consistency is maintained after learning the subject identity.
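The abstract's 2D-lifting mechanism, Score Distillation Sampling (SDS), optimizes 3D parameters so that their renders score highly under a frozen diffusion model: the gradient is the weighted difference between the model's predicted noise and the injected noise, backpropagated through the renderer. Below is a minimal toy sketch of this update, not MVDream's implementation: `render` and `eps_pred` are hypothetical stand-ins (an identity "renderer" and a linear "noise predictor") used only to illustrate the shape of the SDS step.

```python
import numpy as np

rng = np.random.default_rng(0)

def render(theta):
    # Toy stand-in for a differentiable renderer: the "image" is the parameters.
    return theta

def eps_pred(x_t, t):
    # Toy stand-in for the diffusion model's noise prediction eps_hat(x_t; y, t).
    return 0.9 * x_t

def sds_gradient(theta, t, alpha_bar, w=1.0):
    """One SDS gradient estimate:
    grad = w(t) * (eps_hat(x_t, t) - eps) * d(render)/d(theta)."""
    x = render(theta)
    eps = rng.standard_normal(x.shape)
    # Forward-diffuse the render to noise level t.
    x_t = np.sqrt(alpha_bar) * x + np.sqrt(1.0 - alpha_bar) * eps
    # With the identity renderer, the Jacobian d(render)/d(theta) is 1.
    return w * (eps_pred(x_t, t) - eps)

theta = rng.standard_normal(4)
theta0 = theta.copy()
for _ in range(100):
    theta -= 0.05 * sds_gradient(theta, t=0.5, alpha_bar=0.7)
```

In MVDream, the same update is applied to a NeRF's parameters with the multi-view diffusion model as the noise predictor, so that all rendered viewpoints are pulled toward a jointly consistent set of images rather than being optimized view by view.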