
MetaDreamer: Efficient Text-to-3D Creation With Disentangling Geometry and Texture

November 16, 2023
Authors: Lincong Feng, Muyu Wang, Maoyu Wang, Kuo Xu, Xiaoli Liu
cs.AI

Abstract

Generative models for 3D object synthesis have seen significant advancements with the incorporation of prior knowledge distilled from 2D diffusion models. Nevertheless, challenges persist in the form of multi-view geometric inconsistencies and slow generation speeds within existing 3D synthesis frameworks. This can be attributed to two factors: first, the lack of sufficient geometric prior knowledge during optimization, and second, the entanglement between geometry and texture in conventional 3D generation methods. In response, we introduce MetaDreamer, a two-stage optimization approach that leverages rich 2D and 3D prior knowledge. In the first stage, our emphasis is on optimizing the geometric representation to ensure the multi-view consistency and accuracy of 3D objects. In the second stage, we concentrate on fine-tuning the geometry and optimizing the texture, thereby achieving a more refined 3D object. By leveraging 2D and 3D prior knowledge in the two stages respectively, we effectively mitigate the interdependence between geometry and texture. MetaDreamer establishes clear optimization objectives for each stage, resulting in significant time savings in the 3D generation process. Ultimately, MetaDreamer can generate high-quality 3D objects from textual prompts within 20 minutes, and to the best of our knowledge, it is the most efficient text-to-3D generation method. Furthermore, we introduce image control into the process, enhancing the controllability of 3D generation. Extensive empirical evidence confirms that our method is not only highly efficient but also achieves a quality level at the forefront of current state-of-the-art 3D generation techniques.
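The two-stage schedule described in the abstract can be sketched in plain Python. This is a minimal illustration only: the function name `two_stage_plan`, the stage split fraction, and the prior labels are hypothetical stand-ins, not the authors' implementation.

```python
def two_stage_plan(total_steps, stage1_frac=0.5):
    """Sketch of a two-stage optimization schedule (hypothetical names).

    Stage 1 optimizes only the geometric representation, guided by 3D
    prior knowledge; stage 2 fine-tunes geometry and optimizes texture,
    guided by 2D prior knowledge, decoupling the two objectives.
    Returns, per step, (stage, parameters being optimized, prior used).
    """
    split = int(total_steps * stage1_frac)
    plan = []
    for step in range(total_steps):
        if step < split:
            # Stage 1: geometry only, supervised by 3D priors.
            plan.append(("stage1", ("geometry",), "3d_prior"))
        else:
            # Stage 2: geometry fine-tuning plus texture, 2D priors.
            plan.append(("stage2", ("geometry", "texture"), "2d_prior"))
    return plan
```

Keeping the per-stage objectives separate in this way is what the paper credits for both the multi-view consistency and the short (about 20 minutes) generation time.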