Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion
September 17, 2024
Authors: Zhenwei Wang, Tengfei Wang, Zexin He, Gerhard Hancke, Ziwei Liu, Rynson W. H. Lau
cs.AI
Abstract
In 3D modeling, designers often use an existing 3D model as a reference to
create new ones. This practice has inspired the development of Phidias, a novel
generative model that uses diffusion for reference-augmented 3D generation.
Given an image, our method leverages a retrieved or user-provided 3D reference
model to guide the generation process, thereby enhancing the generation
quality, generalization ability, and controllability. Our model integrates
three key components: 1) meta-ControlNet that dynamically modulates the
conditioning strength, 2) dynamic reference routing that mitigates misalignment
between the input image and 3D reference, and 3) self-reference augmentations
that enable self-supervised training with a progressive curriculum.
Collectively, these designs result in a clear improvement over existing
methods. Phidias establishes a unified framework for 3D generation using text,
image, and 3D conditions with versatile applications.
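The abstract's first component, a "meta-ControlNet that dynamically modulates the conditioning strength," can be illustrated with a minimal sketch. The code below is a hypothetical simplification, not the paper's implementation: it assumes a ControlNet-style residual added to backbone features, with a small "meta" mapping (here a single invented linear layer `w`, `b`) predicting a scalar gate from a condition embedding to scale that residual.

```python
import numpy as np

def meta_gated_control(backbone_feat, control_feat, meta_embed, w, b):
    """Scale a ControlNet-style residual by a dynamically predicted strength.

    Hypothetical sketch: a meta-network (one linear layer for illustration)
    maps the condition embedding to a sigmoid gate in (0, 1), which scales
    the control residual before it is added to the backbone features. A gate
    near 0 ignores the 3D reference; near 1 applies it at full strength.
    """
    gate = 1.0 / (1.0 + np.exp(-(meta_embed @ w + b)))  # sigmoid gate
    return backbone_feat + gate * control_feat

rng = np.random.default_rng(0)
feat = rng.standard_normal(8)   # backbone features at one layer
ctrl = rng.standard_normal(8)   # control-branch residual (from 3D reference)
emb = rng.standard_normal(4)    # embedding of the conditioning context
w = rng.standard_normal(4)      # invented meta-network weights
out = meta_gated_control(feat, ctrl, emb, w, 0.0)
```

In this toy form, the gate interpolates each layer's output between the unconditioned backbone features (gate near 0) and fully reference-conditioned features (gate near 1), which is one plausible way to realize per-sample conditioning strength.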