DreamComposer: Controllable 3D Object Generation via Multi-View Conditions
December 6, 2023
Authors: Yunhan Yang, Yukun Huang, Xiaoyang Wu, Yuan-Chen Guo, Song-Hai Zhang, Hengshuang Zhao, Tong He, Xihui Liu
cs.AI
Abstract
Utilizing pre-trained 2D large-scale generative models, recent works are
capable of generating high-quality novel views from a single in-the-wild image.
However, due to the lack of information from multiple views, these works
encounter difficulties in generating controllable novel views. In this paper,
we present DreamComposer, a flexible and scalable framework that can enhance
existing view-aware diffusion models by injecting multi-view conditions.
Specifically, DreamComposer first uses a view-aware 3D lifting module to obtain
3D representations of an object from multiple views. Then, it renders the
latent features of the target view from the 3D representations with the multi-view
feature fusion module. Finally, the target-view features extracted from
multi-view inputs are injected into a pre-trained diffusion model. Experiments
show that DreamComposer is compatible with state-of-the-art diffusion models
for zero-shot novel view synthesis, further enhancing them to generate
high-fidelity novel view images with multi-view conditions, ready for
controllable 3D object reconstruction and various other applications.
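
To make the three-stage pipeline described above concrete, the following is a minimal PyTorch-style sketch of how multi-view conditions could flow through such a system: per-view 3D lifting, multi-view fusion into target-view latent features, and injection into a pretrained diffusion model. The module names (Lift3D, FuseMultiView, denoise_with_condition), tensor shapes, and the additive injection are illustrative assumptions for exposition, not the authors' actual implementation or API.

```python
import torch
import torch.nn as nn


class Lift3D(nn.Module):
    """View-aware 3D lifting: encode one input view into a coarse 3D feature volume."""

    def __init__(self, feat_dim: int = 64, vol_depth: int = 32):
        super().__init__()
        # Predict feat_dim * vol_depth channels per pixel, then reshape into a volume.
        self.encoder = nn.Conv2d(3, feat_dim * vol_depth, kernel_size=3, padding=1)
        self.feat_dim = feat_dim
        self.vol_depth = vol_depth

    def forward(self, view: torch.Tensor) -> torch.Tensor:
        # view: (B, 3, H, W) -> volume: (B, C, D, H, W)
        b, _, h, w = view.shape
        feat = self.encoder(view)
        return feat.view(b, self.feat_dim, self.vol_depth, h, w)


class FuseMultiView(nn.Module):
    """Fuse per-view volumes and render target-view latent features."""

    def __init__(self, feat_dim: int = 64, latent_dim: int = 4):
        super().__init__()
        self.proj = nn.Conv2d(feat_dim, latent_dim, kernel_size=1)

    def forward(self, volumes: torch.Tensor, target_pose: torch.Tensor) -> torch.Tensor:
        # volumes: (B, V, C, D, H, W). A faithful implementation would warp each
        # volume into the target camera frustum given target_pose before fusing;
        # here we simply average over views as a placeholder.
        fused = volumes.mean(dim=1)      # (B, C, D, H, W)
        rendered = fused.mean(dim=2)     # collapse depth -> (B, C, H, W)
        return self.proj(rendered)       # target-view latent features


def denoise_with_condition(unet: nn.Module,
                           noisy_latent: torch.Tensor,
                           timestep: torch.Tensor,
                           cond_latent: torch.Tensor) -> torch.Tensor:
    # Inject the rendered target-view features into a pretrained diffusion U-Net.
    # An additive residual is used here only for illustration; the paper's actual
    # injection mechanism may differ (e.g. concatenation or attention-based).
    return unet(noisy_latent + cond_latent, timestep)


if __name__ == "__main__":
    lift, fuse = Lift3D(), FuseMultiView()
    views = torch.randn(1, 3, 3, 64, 64)                       # 3 input views
    volumes = torch.stack([lift(v) for v in views.unbind(1)], dim=1)
    cond = fuse(volumes, target_pose=torch.eye(4).unsqueeze(0))
    print(cond.shape)                                           # (1, 4, 64, 64)
```

The sketch only illustrates the data flow the abstract describes; in particular, the averaging-based fusion and additive injection stand in for the paper's multi-view feature fusion and conditioning mechanisms.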