DreamComposer: Controllable 3D Object Generation via Multi-View Conditions
December 6, 2023
Authors: Yunhan Yang, Yukun Huang, Xiaoyang Wu, Yuan-Chen Guo, Song-Hai Zhang, Hengshuang Zhao, Tong He, Xihui Liu
cs.AI
Abstract
Utilizing pre-trained 2D large-scale generative models, recent works are
capable of generating high-quality novel views from a single in-the-wild image.
However, due to the lack of information from multiple views, these works
encounter difficulties in generating controllable novel views. In this paper,
we present DreamComposer, a flexible and scalable framework that can enhance
existing view-aware diffusion models by injecting multi-view conditions.
Specifically, DreamComposer first uses a view-aware 3D lifting module to obtain
3D representations of an object from multiple views. Then, it renders the
latent features of the target view from the 3D representations with the multi-view
feature fusion module. Finally, the target-view features extracted from
multi-view inputs are injected into a pre-trained diffusion model. Experiments
show that DreamComposer is compatible with state-of-the-art diffusion models
for zero-shot novel view synthesis, further enhancing them to generate
high-fidelity novel view images with multi-view conditions, ready for
controllable 3D object reconstruction and various other applications.
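
To make the three-stage pipeline described above concrete, the following is a minimal PyTorch-style sketch of how multi-view conditions could flow through such a system: per-view 3D lifting, multi-view fusion into target-view latent features, and injection into a pretrained diffusion model. The module names (Lift3D, FuseMultiView, denoise_with_condition), tensor shapes, and the additive injection are illustrative assumptions for exposition, not the authors' actual implementation or API.

```python
import torch
import torch.nn as nn


class Lift3D(nn.Module):
    """View-aware 3D lifting: encode one input view into a coarse 3D feature volume."""

    def __init__(self, feat_dim: int = 64, vol_depth: int = 32):
        super().__init__()
        # Predict feat_dim * vol_depth channels per pixel, then reshape into a volume.
        self.encoder = nn.Conv2d(3, feat_dim * vol_depth, kernel_size=3, padding=1)
        self.feat_dim = feat_dim
        self.vol_depth = vol_depth

    def forward(self, view: torch.Tensor) -> torch.Tensor:
        # view: (B, 3, H, W) -> volume: (B, C, D, H, W)
        b, _, h, w = view.shape
        feat = self.encoder(view)
        return feat.view(b, self.feat_dim, self.vol_depth, h, w)


class FuseMultiView(nn.Module):
    """Fuse per-view volumes and render target-view latent features."""

    def __init__(self, feat_dim: int = 64, latent_dim: int = 4):
        super().__init__()
        self.proj = nn.Conv2d(feat_dim, latent_dim, kernel_size=1)

    def forward(self, volumes: torch.Tensor, target_pose: torch.Tensor) -> torch.Tensor:
        # volumes: (B, V, C, D, H, W). A faithful implementation would warp each
        # volume into the target camera frustum given target_pose before fusing;
        # here we simply average over views as a placeholder.
        fused = volumes.mean(dim=1)      # (B, C, D, H, W)
        rendered = fused.mean(dim=2)     # collapse depth -> (B, C, H, W)
        return self.proj(rendered)       # target-view latent features


def denoise_with_condition(unet: nn.Module,
                           noisy_latent: torch.Tensor,
                           timestep: torch.Tensor,
                           cond_latent: torch.Tensor) -> torch.Tensor:
    # Inject the rendered target-view features into a pretrained diffusion U-Net.
    # An additive residual is used here only for illustration; the paper's actual
    # injection mechanism may differ (e.g. concatenation or attention-based).
    return unet(noisy_latent + cond_latent, timestep)


if __name__ == "__main__":
    lift, fuse = Lift3D(), FuseMultiView()
    views = torch.randn(1, 3, 3, 64, 64)                       # 3 input views
    volumes = torch.stack([lift(v) for v in views.unbind(1)], dim=1)
    cond = fuse(volumes, target_pose=torch.eye(4).unsqueeze(0))
    print(cond.shape)                                           # (1, 4, 64, 64)
```

The sketch only illustrates the data flow the abstract describes; in particular, the averaging-based fusion and additive injection stand in for the paper's multi-view feature fusion and conditioning mechanisms.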