DreamComposer: Controllable 3D Object Generation via Multi-View Conditions

December 6, 2023
Authors: Yunhan Yang, Yukun Huang, Xiaoyang Wu, Yuan-Chen Guo, Song-Hai Zhang, Hengshuang Zhao, Tong He, Xihui Liu
cs.AI

Abstract

Building on pre-trained large-scale 2D generative models, recent works can generate high-quality novel views from a single in-the-wild image. However, because they lack information from multiple views, these works struggle to generate controllable novel views. In this paper, we present DreamComposer, a flexible and scalable framework that enhances existing view-aware diffusion models by injecting multi-view conditions. Specifically, DreamComposer first uses a view-aware 3D lifting module to obtain 3D representations of an object from multiple views. Then, it renders the latent features of the target view from the 3D representations with a multi-view feature fusion module. Finally, the target-view features extracted from the multi-view inputs are injected into a pre-trained diffusion model. Experiments show that DreamComposer is compatible with state-of-the-art diffusion models for zero-shot novel view synthesis, further enhancing them to generate high-fidelity novel view images with multi-view conditions, ready for controllable 3D object reconstruction and various other applications.
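The fuse-then-inject flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names `view_weights`, `fuse_features`, and `inject`, the cosine-based weighting by azimuth difference, and the additive injection are all assumptions chosen for illustration, and plain Python lists stand in for the latent features that the 3D lifting module would render for each input view.

```python
import math

def view_weights(input_azimuths_deg, target_azimuth_deg):
    """Assumed weighting: input views closer in azimuth to the target weigh more."""
    raw = []
    for az in input_azimuths_deg:
        delta = math.radians(az - target_azimuth_deg)
        raw.append((math.cos(delta) + 1.0) / 2.0)  # maps the angle gap into [0, 1]
    total = sum(raw)
    if total <= 1e-12:
        # Every view faces directly away from the target: fall back to uniform weights.
        return [1.0 / len(raw)] * len(raw)
    return [w / total for w in raw]

def fuse_features(per_view_features, weights):
    """Weighted sum of per-view feature vectors (stand-ins for rendered target-view latents)."""
    dim = len(per_view_features[0])
    return [
        sum(w * feat[i] for w, feat in zip(weights, per_view_features))
        for i in range(dim)
    ]

def inject(base_features, condition, scale=1.0):
    """Additive injection, a stand-in for conditioning a pre-trained diffusion model."""
    return [b + scale * c for b, c in zip(base_features, condition)]

if __name__ == "__main__":
    # Three hypothetical input views at 0, 90, and 180 degrees; target view at 0 degrees.
    weights = view_weights([0.0, 90.0, 180.0], 0.0)
    fused = fuse_features([[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]], weights)
    print(weights)
    print(inject([1.0, 1.0], fused, scale=0.5))
```

In this sketch the view facing the target dominates the fused feature, while the view directly behind the object contributes nothing. A real system would operate on learned latent feature maps rather than short lists, and could use a different weighting or injection mechanism entirely.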