

IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation

February 13, 2024
Authors: Luke Melas-Kyriazi, Iro Laina, Christian Rupprecht, Natalia Neverova, Andrea Vedaldi, Oran Gafni, Filippos Kokkinos
cs.AI

Abstract

Most text-to-3D generators build upon off-the-shelf text-to-image models trained on billions of images. They use variants of Score Distillation Sampling (SDS), which is slow, somewhat unstable, and prone to artifacts. A mitigation is to fine-tune the 2D generator to be multi-view aware, which can help distillation or can be combined with reconstruction networks to output 3D objects directly. In this paper, we further explore the design space of text-to-3D models. We significantly improve multi-view generation by considering video instead of image generators. Combined with a 3D reconstruction algorithm which, by using Gaussian splatting, can optimize a robust image-based loss, we directly produce high-quality 3D outputs from the generated views. Our new method, IM-3D, reduces the number of evaluations of the 2D generator network 10-100x, resulting in a much more efficient pipeline, better quality, fewer geometric inconsistencies, and higher yield of usable 3D assets.
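The reconstruction step the abstract describes — directly fitting a 3D representation to the generated views by optimizing a robust image-based loss — can be illustrated with a toy gradient-descent loop. This is a minimal pure-Python sketch under heavy simplifying assumptions, not the paper's implementation: real IM-3D differentiably renders 3D Gaussians, whereas here a flat per-pixel "scene" stands in for the renderer, and the names `fit_to_views` and `huber_grad` are illustrative, not from the paper.

```python
import random

def huber_grad(residual, delta=0.1):
    # Gradient of a Huber-style robust loss w.r.t. the rendered pixel:
    # linear for small residuals, clipped for large ones (outlier-tolerant).
    return max(-delta, min(delta, residual))

def fit_to_views(views, steps=300, lr=0.5):
    # Fit one "scene" (a flat pixel list, standing in for a Gaussian-splat
    # render) to several slightly inconsistent generated views at once.
    n = len(views[0])
    scene = [0.0] * n
    for _ in range(steps):
        for i in range(n):
            g = sum(huber_grad(scene[i] - v[i]) for v in views) / len(views)
            scene[i] -= lr * g
    return scene

# Hypothetical data: one ground-truth image and 4 noisy "generated views" of it.
random.seed(0)
target = [random.random() for _ in range(16)]
views = [[t + random.gauss(0, 0.01) for t in target] for _ in range(4)]

recon = fit_to_views(views)
err = sum(abs(r - t) for r, t in zip(recon, target)) / len(target)
print(f"mean abs error after fitting: {err:.4f}")
```

The robust (clipped) gradient is the point of interest: it lets the fit tolerate small geometric inconsistencies between views instead of letting any single bad pixel dominate, which is why an image-space loss over generated views can still yield a clean reconstruction.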
