One-2-3-45++: Fast Single Image to 3D Objects with Consistent Multi-View Generation and 3D Diffusion
November 14, 2023
Authors: Minghua Liu, Ruoxi Shi, Linghao Chen, Zhuoyang Zhang, Chao Xu, Xinyue Wei, Hansheng Chen, Chong Zeng, Jiayuan Gu, Hao Su
cs.AI
Abstract
Recent advancements in open-world 3D object generation have been remarkable, with image-to-3D methods offering superior fine-grained control over their text-to-3D counterparts. However, most existing models fall short in simultaneously providing rapid generation speeds and high fidelity to input images, two features essential for practical applications. In this paper, we present One-2-3-45++, an innovative method that transforms a single image into a detailed 3D textured mesh in approximately one minute. Our approach aims to fully harness the extensive knowledge embedded in 2D diffusion models and priors from valuable yet limited 3D data. This is achieved by initially finetuning a 2D diffusion model for consistent multi-view image generation, followed by elevating these images to 3D with the aid of multi-view conditioned 3D native diffusion models. Extensive experimental evaluations demonstrate that our method can produce high-quality, diverse 3D assets that closely mirror the original input image. Our project webpage: https://sudo-ai-3d.github.io/One2345plus_page.
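
To make the two-stage pipeline described in the abstract concrete, below is a minimal, hypothetical Python sketch of the data flow: a fine-tuned 2D diffusion model first produces a set of mutually consistent views of the object, and a multi-view conditioned 3D native diffusion model then lifts those views to a textured mesh. All class and function names here (MultiViewDiffusion, MeshDiffusion3D, generate_views, reconstruct_mesh) are illustrative placeholders, not the authors' released API.

```python
# Hypothetical sketch of the two-stage One-2-3-45++ pipeline described above.
# The class and method names are placeholders standing in for the paper's
# fine-tuned models; they are not the authors' actual interface.

from typing import Any, List

from PIL import Image


class MultiViewDiffusion:
    """Stage 1: a 2D diffusion model fine-tuned so that, given one input
    image, it generates a fixed set of mutually consistent object views."""

    def generate_views(self, image: Image.Image, num_views: int = 6) -> List[Image.Image]:
        raise NotImplementedError  # placeholder for the fine-tuned 2D diffusion model


class MeshDiffusion3D:
    """Stage 2: a 3D native diffusion model conditioned on the multi-view
    images, producing a textured 3D mesh."""

    def reconstruct_mesh(self, views: List[Image.Image]) -> Any:
        raise NotImplementedError  # placeholder for the multi-view conditioned 3D diffusion model


def image_to_3d(path: str) -> Any:
    """End-to-end flow: single image -> consistent views -> textured mesh."""
    image = Image.open(path).convert("RGB")
    views = MultiViewDiffusion().generate_views(image)  # consistent multi-view generation
    return MeshDiffusion3D().reconstruct_mesh(views)    # lift the views to a textured 3D mesh
```

As the abstract frames it, decoupling the problem this way lets the 2D diffusion prior handle appearance and view consistency while the 3D diffusion model, trained on limited 3D data, handles geometry, which is what enables feed-forward generation in roughly one minute rather than per-shape optimization.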