Tailor3D: Customized 3D Assets Editing and Generation with Dual-Side Images
July 8, 2024
Authors: Zhangyang Qi, Yunhan Yang, Mengchen Zhang, Long Xing, Xiaoyang Wu, Tong Wu, Dahua Lin, Xihui Liu, Jiaqi Wang, Hengshuang Zhao
cs.AI
Abstract
Recent advances in 3D AIGC have shown promise in directly creating 3D objects
from text and images, offering significant cost savings in animation and
product design. However, detailed editing and customization of 3D assets remains a
long-standing challenge. Specifically, 3D generation methods lack the ability
to follow finely detailed instructions as precisely as their 2D image creation
counterparts. Imagine obtaining a toy through 3D AIGC that comes with unwanted
accessories and clothing. To tackle this challenge, we propose a novel pipeline
called Tailor3D, which swiftly creates customized 3D assets from editable
dual-side images. We aim to emulate a tailor's ability to locally change
objects or perform overall style transfer. Unlike creating 3D assets from
multiple views, using dual-side images eliminates conflicts on overlapping
areas that occur when editing individual views. Specifically, it begins by
editing the front view, then generates the back view of the object through
multi-view diffusion. Afterward, it proceeds to edit the back view. Finally, a
Dual-sided LRM is proposed to seamlessly stitch together the front and back 3D
features, akin to a tailor sewing together the front and back of a garment. The
Dual-sided LRM rectifies imperfect consistency between the front and back
views, enhancing editing capabilities and reducing memory burdens while
seamlessly integrating them into a unified 3D representation with the LoRA
Triplane Transformer. Experimental results demonstrate Tailor3D's effectiveness
across various 3D generation and editing tasks, including 3D generative fill
and style transfer. It provides a user-friendly, efficient solution for editing
3D assets, with each editing step taking only seconds to complete.
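
The order of the four stages described in the abstract (edit the front view, generate the back view, edit the back view, fuse both sides) can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the function name run_dual_side_pipeline, the DualSideResult container, and all four callables are hypothetical placeholders standing in for a 2D image editor, a multi-view diffusion model, and the Dual-sided LRM.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical placeholder types; the abstract does not specify any data format.
Image = Any
Asset3D = Any


@dataclass
class DualSideResult:
    front: Image    # front view after editing
    back: Image     # back view after generation and editing
    asset: Asset3D  # unified 3D representation produced by the fusion step


def run_dual_side_pipeline(
    front: Image,
    edit_front: Callable[[Image], Image],      # assumed: any 2D image editing step
    generate_back: Callable[[Image], Image],   # assumed: stands in for multi-view diffusion
    edit_back: Callable[[Image], Image],       # assumed: any 2D image editing step
    fuse: Callable[[Image, Image], Asset3D],   # assumed: stands in for the Dual-sided LRM
) -> DualSideResult:
    """Run the four stages in the order the abstract describes."""
    edited_front = edit_front(front)           # 1. edit the front view
    back = generate_back(edited_front)         # 2. generate the back view from the edited front
    edited_back = edit_back(back)              # 3. edit the back view
    asset = fuse(edited_front, edited_back)    # 4. stitch front and back into one 3D asset
    return DualSideResult(edited_front, edited_back, asset)
```

Because the back view is generated from the already edited front view, front-side edits carry over to the back before any back-side editing takes place. Only two views are edited, so there are no additional overlapping views whose edits could conflict.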