Tailor3D: Customized 3D Assets Editing and Generation with Dual-Side Images
July 8, 2024
Authors: Zhangyang Qi, Yunhan Yang, Mengchen Zhang, Long Xing, Xiaoyang Wu, Tong Wu, Dahua Lin, Xihui Liu, Jiaqi Wang, Hengshuang Zhao
cs.AI
Abstract
Recent advances in 3D AIGC have shown promise in directly creating 3D objects
from text and images, offering significant cost savings in animation and
product design. However, detailed editing and customization of 3D assets remain a
long-standing challenge. Specifically, 3D generation methods lack the ability
to follow finely detailed instructions as precisely as their 2D image creation
counterparts. Imagine obtaining a toy through 3D AIGC, only to find it has undesired
accessories and clothing. To tackle this challenge, we propose a novel pipeline
called Tailor3D, which swiftly creates customized 3D assets from editable
dual-side images. We aim to emulate a tailor's ability to locally change
objects or perform overall style transfer. Unlike creating 3D assets from
multiple views, using dual-side images eliminates conflicts on overlapping
areas that occur when editing individual views. Specifically, it begins by
editing the front view, then generates the back view of the object through
multi-view diffusion. Afterward, it proceeds to edit the back view. Finally, a
Dual-sided LRM is proposed to seamlessly stitch together the front and back 3D
features, akin to a tailor sewing together the front and back of a garment. The
Dual-sided LRM rectifies imperfect consistency between the front and back
views, enhancing editing capabilities and reducing memory burdens while
seamlessly integrating them into a unified 3D representation with the LoRA
Triplane Transformer. Experimental results demonstrate Tailor3D's effectiveness
across various 3D generation and editing tasks, including 3D generative fill
and style transfer. It provides a user-friendly, efficient solution for editing
3D assets, with each editing step taking only seconds to complete.
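
The order of steps described in the abstract (edit the front view, generate the back view, edit the back view, fuse both sides into one 3D asset) can be sketched as below. This is only an illustrative outline, not the authors' implementation: the class `DualSidePipeline` and the callables `edit_image`, `generate_back_view`, and `fuse_to_3d` are hypothetical placeholders standing in for a 2D image editor, a multi-view diffusion model, and the Dual-sided LRM respectively.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Placeholder types: the abstract does not specify concrete data structures.
Image = Any
Asset3D = Any


@dataclass
class DualSidePipeline:
    """Hypothetical orchestration of the four steps named in the abstract."""

    # Assumed 2D editor: (image, instruction) -> edited image.
    edit_image: Callable[[Image, str], Image]
    # Assumed stand-in for multi-view diffusion: front view -> back view.
    generate_back_view: Callable[[Image], Image]
    # Assumed stand-in for the Dual-sided LRM: (front, back) -> 3D asset.
    fuse_to_3d: Callable[[Image, Image], Asset3D]

    def run(self, front: Image, front_prompt: str, back_prompt: str) -> Asset3D:
        # Step 1: edit the front view.
        edited_front = self.edit_image(front, front_prompt)
        # Step 2: generate the back view from the edited front view.
        back = self.generate_back_view(edited_front)
        # Step 3: edit the back view.
        edited_back = self.edit_image(back, back_prompt)
        # Step 4: stitch both sides into a single 3D representation.
        return self.fuse_to_3d(edited_front, edited_back)
```

Because the back view is generated from the already edited front view, front-side edits carry over to the back before any back-side edit is applied. Real models would be passed in through the three callables.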