

SyncDreamer: Generating Multiview-consistent Images from a Single-view Image

September 7, 2023
Authors: Yuan Liu, Cheng Lin, Zijiao Zeng, Xiaoxiao Long, Lingjie Liu, Taku Komura, Wenping Wang
cs.AI

Abstract

In this paper, we present a novel diffusion model called SyncDreamer that generates multiview-consistent images from a single-view image. Using pretrained large-scale 2D diffusion models, the recent work Zero123 demonstrates the ability to generate plausible novel views from a single-view image of an object. However, maintaining consistency in geometry and colors for the generated images remains a challenge. To address this issue, we propose a synchronized multiview diffusion model that models the joint probability distribution of multiview images, enabling the generation of multiview-consistent images in a single reverse process. SyncDreamer synchronizes the intermediate states of all the generated images at every step of the reverse process through a 3D-aware feature attention mechanism that correlates the corresponding features across different views. Experiments show that SyncDreamer generates images with high consistency across different views, making it well suited for various 3D generation tasks such as novel view synthesis, text-to-3D, and image-to-3D.
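
To illustrate the idea of synchronizing all views within a single reverse process, here is a minimal, hypothetical PyTorch sketch. It is not the authors' implementation: the names `SyncDenoiser`, `cross_view_attn`, and `synced_reverse_process` are illustrative, views are reduced to toy feature vectors, and standard multi-head attention stands in for the paper's 3D-aware feature attention. It only shows the structure of denoising N views jointly, with each step's noise prediction conditioned on the intermediate states of all views.

```python
# Hypothetical sketch of a synchronized multiview reverse-diffusion loop.
# All module and function names are illustrative, not from the paper's code.
import torch
import torch.nn as nn


class SyncDenoiser(nn.Module):
    """Toy noise predictor for N views with cross-view attention."""

    def __init__(self, n_views: int, dim: int = 64):
        super().__init__()
        self.encode = nn.Linear(dim, dim)  # per-view feature encoder
        # Ordinary multi-head attention used here as a stand-in for the
        # paper's 3D-aware feature attention across views.
        self.cross_view_attn = nn.MultiheadAttention(
            embed_dim=dim, num_heads=4, batch_first=True)
        self.decode = nn.Linear(dim, dim)  # per-view noise head

    def forward(self, x_t: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # x_t: (batch, n_views, dim) -- noisy states of all views at step t
        h = self.encode(x_t) + t.view(-1, 1, 1)  # crude timestep conditioning
        # Every view attends to the features of all views, correlating the
        # corresponding features and synchronizing the intermediate states.
        h, _ = self.cross_view_attn(h, h, h)
        return self.decode(h)  # predicted noise for every view


@torch.no_grad()
def synced_reverse_process(model, n_views=8, dim=64, steps=50):
    """DDPM-style reverse process run jointly over all views."""
    betas = torch.linspace(1e-4, 0.02, steps)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)

    x = torch.randn(1, n_views, dim)  # start every view from pure noise
    for t in reversed(range(steps)):
        eps = model(x, torch.tensor([float(t)]))  # joint noise prediction
        coef = betas[t] / torch.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps) / torch.sqrt(alphas[t])
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[t]) * noise  # one synchronized step
    return x  # jointly denoised (multiview-consistent) states


views = synced_reverse_process(SyncDenoiser(n_views=8))
print(views.shape)  # torch.Size([1, 8, 64])
```

The key design point mirrored here is that all views share one reverse process: the cross-view attention inside the denoiser is applied at every step, so no view is denoised in isolation.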