
Generic 3D Diffusion Adapter Using Controlled Multi-View Editing

March 18, 2024
Authors: Hansheng Chen, Ruoxi Shi, Yulin Liu, Bokui Shen, Jiayuan Gu, Gordon Wetzstein, Hao Su, Leonidas Guibas
cs.AI

Abstract

Open-domain 3D object synthesis has been lagging behind image synthesis due to limited data and higher computational complexity. To bridge this gap, recent works have investigated multi-view diffusion but often fall short in 3D consistency, visual quality, or efficiency. This paper proposes MVEdit, which functions as a 3D counterpart of SDEdit, employing ancestral sampling to jointly denoise multi-view images and output high-quality textured meshes. Built on off-the-shelf 2D diffusion models, MVEdit achieves 3D consistency through a training-free 3D Adapter, which lifts the 2D views of the last timestep into a coherent 3D representation, then conditions the 2D views of the next timestep using rendered views, without compromising visual quality. With an inference time of only 2-5 minutes, this framework achieves a better trade-off between quality and speed than score distillation. MVEdit is highly versatile and extendable, with a wide range of applications including text/image-to-3D generation, 3D-to-3D editing, and high-quality texture synthesis. In particular, evaluations demonstrate state-of-the-art performance in both image-to-3D and text-guided texture generation tasks. Additionally, we introduce a method for fine-tuning 2D latent diffusion models on small 3D datasets with limited resources, enabling fast low-resolution text-to-3D initialization.
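
The abstract describes MVEdit's core loop: at each ancestral-sampling step, the 2D diffusion model denoises all views, the training-free 3D Adapter lifts those views into a shared 3D representation, and renders of that representation condition the next denoising step. The following is a minimal, hypothetical Python sketch of that control flow only, not the authors' implementation; `denoise_step`, `fit_3d_representation`, and `render_views` are placeholder stand-ins (with toy bodies) for the real 2D diffusion sampler, the 3D Adapter, and the renderer.

```python
from typing import Optional
import torch

NUM_STEPS = 30   # diffusion timesteps (illustrative value)
NUM_VIEWS = 8    # number of camera views (illustrative value)

def denoise_step(views: torch.Tensor, t: int, cond: Optional[torch.Tensor]) -> torch.Tensor:
    """Placeholder for one ancestral-sampling step of an off-the-shelf 2D diffusion model,
    optionally conditioned on renders of the current 3D representation."""
    noise_scale = t / NUM_STEPS
    guess = views if cond is None else 0.5 * (views + cond)  # blend views with renders (toy)
    return guess * (1.0 - 0.1 * noise_scale)                 # toy update, not a real sampler

def fit_3d_representation(views: torch.Tensor) -> torch.Tensor:
    """Placeholder for the 3D Adapter: lift multi-view images into one coherent 3D state."""
    return views.mean(dim=0, keepdim=True)                   # toy "3D" state

def render_views(rep: torch.Tensor, num_views: int) -> torch.Tensor:
    """Placeholder renderer: produce one conditioning image per camera from the 3D state."""
    return rep.expand(num_views, -1, -1, -1).clone()

views = torch.randn(NUM_VIEWS, 3, 64, 64)  # start all views from noise
cond = None
for t in reversed(range(NUM_STEPS)):
    views = denoise_step(views, t, cond)    # jointly denoise the multi-view images
    rep = fit_3d_representation(views)      # lift the denoised views into 3D
    cond = render_views(rep, NUM_VIEWS)     # rendered views condition the next timestep

print(views.shape)  # final views; the fitted 3D state would yield the textured mesh
```

The point of the sketch is the alternation between 2D denoising and 3D lifting: because each new timestep is conditioned on renders of a single shared 3D representation, the views are pulled toward mutual consistency without retraining the 2D model.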
