MVDream: Multi-view Diffusion for 3D Generation
August 31, 2023
Authors: Yichun Shi, Peng Wang, Jianglong Ye, Mai Long, Kejie Li, Xiao Yang
cs.AI
Abstract
We propose MVDream, a multi-view diffusion model that is able to generate
geometrically consistent multi-view images from a given text prompt. By
leveraging image diffusion models pre-trained on large-scale web datasets and a
multi-view dataset rendered from 3D assets, the resulting multi-view diffusion
model can achieve both the generalizability of 2D diffusion and the consistency
of 3D data. Such a model can thus be applied as a multi-view prior for 3D
generation via Score Distillation Sampling, where it greatly improves the
stability of existing 2D-lifting methods by solving the 3D consistency problem.
Finally, we show that the multi-view diffusion model can also be fine-tuned
under a few-shot setting for personalized 3D generation, i.e., the DreamBooth3D
application, where the consistency can be maintained after learning the subject
identity.
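To make the "multi-view prior via Score Distillation Sampling" idea concrete, below is a minimal, hypothetical sketch of an SDS-style update loop. The denoiser is a stand-in (a real model such as MVDream would predict noise jointly over all views, conditioned on the text prompt and camera poses), the noise schedule is a toy one, and the "3D parameters" are just per-view images with an identity renderer; none of these names or shapes come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def multiview_denoiser(x_t, t):
    """Stand-in for a multi-view diffusion model's noise prediction.

    A real model would condition on the text prompt and the camera
    poses of all views jointly, which is what enforces cross-view
    consistency; here we return a fixed pseudo-prediction so the
    update loop is runnable."""
    return 0.1 * x_t  # hypothetical noise estimate

def sds_step(params, render, t, lr=0.01):
    """One Score Distillation Sampling update on the 3D parameters.

    SDS gradient (from the 2D-lifting literature):
        w(t) * (eps_hat - eps) * d(render(params))/d(params)
    Here render is the identity, so the Jacobian term is 1 and we
    take w(t) = 1 for simplicity."""
    eps = rng.standard_normal(params.shape)           # sampled noise
    alpha = 1.0 - t                                   # toy noise schedule
    x_t = np.sqrt(alpha) * render(params) + np.sqrt(1.0 - alpha) * eps
    eps_hat = multiview_denoiser(x_t, t)              # joint multi-view score
    grad = eps_hat - eps                              # SDS pseudo-gradient
    return params - lr * grad

# Toy "3D representation": 4 camera views of an 8x8 image,
# optimized jointly under the shared (multi-view) prior.
params = rng.standard_normal((4, 8, 8))
for step in range(100):
    t = rng.uniform(0.02, 0.98)                       # random diffusion time
    params = sds_step(params, render=lambda p: p, t=t)
print(params.shape)
```

The key difference from per-view 2D lifting is that `eps_hat` is predicted for all views at once, so each update pushes the views toward a mutually consistent solution rather than toward independent 2D samples.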