SyncDreamer: Generating Multiview-consistent Images from a Single-view Image

Abstract

In this paper, we present a novel diffusion model called SyncDreamer that generates multiview-consistent images from a single-view image. Using pretrained large-scale 2D diffusion models, the recent work Zero123 demonstrates the ability to generate plausible novel views from a single-view image of an object. However, maintaining consistency in geometry and colors across the generated images remains a challenge. To address this issue, we propose a synchronized multiview diffusion model that models the joint probability distribution of multiview images, enabling the generation of multiview-consistent images in a single reverse process. SyncDreamer synchronizes the intermediate states of all the generated images at every step of the reverse process through a 3D-aware feature attention mechanism that correlates the corresponding features across different views. Experiments show that SyncDreamer generates images with high consistency across different views, making it well suited to various 3D generation tasks such as novel view synthesis, text-to-3D, and image-to-3D.
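The idea of synchronizing intermediate states at every reverse step can be illustrated with a deliberately simplified toy. The sketch below is not the SyncDreamer network: each "image" is a single float, and a plain average over views stands in for the 3D-aware feature attention. The function names, the update rule, and the noise schedule are all hypothetical choices made only for illustration. It shows that when every view is updated using the current states of all views, the final samples agree closely, whereas views that are denoised independently do not.

```python
import random


def toy_reverse_process(num_views, num_steps, synchronize, seed=0):
    """Toy reverse process over scalar 'images', one per view.

    Assumption: this is an illustrative stand-in, not the paper's model.
    A mean over views replaces the 3D-aware feature attention.
    """
    rng = random.Random(seed)
    # Every view starts from independent Gaussian noise (the state at step T).
    states = [rng.gauss(0.0, 1.0) for _ in range(num_views)]
    for step in range(num_steps):
        if synchronize:
            # Shared context: each view is updated using all current states.
            context = sum(states) / num_views
            targets = [context] * num_views
        else:
            # Independent sampling: each view only sees its own state.
            targets = list(states)
        # Hypothetical noise schedule that decays linearly to zero.
        noise_scale = 0.1 * (1.0 - (step + 1) / num_steps)
        states = [
            0.5 * s + 0.5 * t + noise_scale * rng.gauss(0.0, 1.0)
            for s, t in zip(states, targets)
        ]
    return states


def spread(values):
    """Largest disagreement between any two views."""
    return max(values) - min(values)


if __name__ == "__main__":
    synced = toy_reverse_process(num_views=8, num_steps=50, synchronize=True)
    independent = toy_reverse_process(num_views=8, num_steps=50, synchronize=False)
    print("spread with synchronization:   ", spread(synced))
    print("spread without synchronization:", spread(independent))
```

In this toy, consistency is measured as the spread between views. In the actual method, consistency refers to geometry and color agreement across the generated images.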
