An Object is Worth 64x64 Pixels: Generating 3D Object via Image Diffusion
August 6, 2024
Authors: Xingguang Yan, Han-Hung Lee, Ziyu Wan, Angel X. Chang
cs.AI
Abstract
We introduce a new approach for generating realistic 3D models with UV maps
through a representation termed "Object Images." This approach encapsulates
surface geometry, appearance, and patch structures within a 64x64 pixel image,
effectively converting complex 3D shapes into a more manageable 2D format. By
doing so, we address the challenges of both geometric and semantic irregularity
inherent in polygonal meshes. This method allows us to use image generation
models, such as Diffusion Transformers, directly for 3D shape generation.
Evaluated on the ABO dataset, our generated shapes with patch structures
achieve point cloud FID comparable to recent 3D generative models, while
naturally supporting PBR material generation.
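
To make the idea of packing a shape into a small image concrete, the following is a minimal sketch of how such an image could be turned back into a mesh. It is not the authors' implementation. The channel layout (channels 0-2 as xyz surface positions, channel 3 as an occupancy mask marking which pixels belong to a patch), the function name `decode_object_image`, and the `mask_threshold` parameter are all assumptions made for illustration. Appearance and material channels are omitted.

```python
import numpy as np


def decode_object_image(img, mask_threshold=0.5):
    """Turn an (H, W, 4) array into vertices and triangle faces.

    Assumed (hypothetical) layout: channels 0-2 store xyz positions,
    channel 3 stores an occupancy mask separating patches from background.
    """
    h, w, _ = img.shape
    valid = img[..., 3] > mask_threshold

    # Give every occupied pixel a vertex index; empty pixels get -1.
    index = np.full((h, w), -1, dtype=np.int64)
    index[valid] = np.arange(valid.sum())
    vertices = img[..., :3][valid]

    # Connect each 2x2 block of occupied pixels with two triangles.
    faces = []
    for i in range(h - 1):
        for j in range(w - 1):
            a, b = index[i, j], index[i, j + 1]
            c, d = index[i + 1, j], index[i + 1, j + 1]
            if min(a, b, c, d) >= 0:
                faces.append((a, b, c))
                faces.append((b, d, c))

    return vertices, np.array(faces, dtype=np.int64).reshape(-1, 3)
```

Because pixels are only connected when all four neighbors are occupied, regions separated by empty pixels decode into separate surface patches, which is one simple way an image can carry patch structure alongside geometry.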