DreamComposer: Controllable 3D Object Generation via Multi-View Conditions
December 6, 2023
Authors: Yunhan Yang, Yukun Huang, Xiaoyang Wu, Yuan-Chen Guo, Song-Hai Zhang, Hengshuang Zhao, Tong He, Xihui Liu
cs.AI
Abstract
Utilizing pre-trained 2D large-scale generative models, recent works are
capable of generating high-quality novel views from a single in-the-wild image.
However, due to the lack of information from multiple views, these works
encounter difficulties in generating controllable novel views. In this paper,
we present DreamComposer, a flexible and scalable framework that can enhance
existing view-aware diffusion models by injecting multi-view conditions.
Specifically, DreamComposer first uses a view-aware 3D lifting module to obtain
3D representations of an object from multiple views. Then, it renders the
latent features of the target view from 3D representations with the multi-view
feature fusion module. Finally the target view features extracted from
multi-view inputs are injected into a pre-trained diffusion model. Experiments
show that DreamComposer is compatible with state-of-the-art diffusion models
for zero-shot novel view synthesis, further enhancing them to generate
high-fidelity novel view images with multi-view conditions, ready for
controllable 3D object reconstruction and various other applications.
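The three-stage pipeline described above (view-aware 3D lifting, multi-view feature fusion toward a target view, and condition injection into a pre-trained diffusion model) can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, shapes, and the angle-weighted fusion and additive injection are hypothetical stand-ins for the actual 3D lifting, rendering, and cross-attention conditioning modules.

```python
import numpy as np

# Hypothetical sizes: F = feature dimension, V = number of input views.
F, V = 8, 3

def lift_to_3d(view_feats):
    """View-aware 3D lifting (sketch): gather each view's 2D features
    into a shared representation, here simply a per-view stack."""
    return np.stack(view_feats)                     # (V, F)

def fuse_for_target(volume, view_angles, target_angle):
    """Multi-view feature fusion (sketch): weight each view's features
    by angular proximity to the target view, then sum."""
    diffs = np.abs(np.asarray(view_angles) - target_angle)
    w = np.exp(-diffs / 90.0)                       # nearer views count more
    w = w / w.sum()                                 # normalize weights
    return (w[:, None] * volume).sum(axis=0)        # (F,)

def inject_condition(noisy_latent, cond_feat, scale=1.0):
    """Condition injection (sketch): add fused target-view features to the
    diffusion latent, standing in for learned cross-attention layers."""
    return noisy_latent + scale * cond_feat

# Toy inputs: three views at different camera azimuths (degrees).
view_feats  = [np.random.rand(F) for _ in range(V)]
view_angles = [0.0, 120.0, 240.0]
target      = 60.0

volume = lift_to_3d(view_feats)
cond   = fuse_for_target(volume, view_angles, target)
latent = inject_condition(np.zeros(F), cond)
```

The sketch only conveys the data flow: multiple posed views are lifted to a shared representation, rendered/fused into target-view features, and those features condition the denoising step of an existing view-aware diffusion model.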