
SV4D: Dynamic 3D Content Generation with Multi-Frame and Multi-View Consistency

July 24, 2024
作者: Yiming Xie, Chun-Han Yao, Vikram Voleti, Huaizu Jiang, Varun Jampani
cs.AI

Abstract

We present Stable Video 4D (SV4D), a latent video diffusion model for multi-frame and multi-view consistent dynamic 3D content generation. Unlike previous methods that rely on separately trained generative models for video generation and novel view synthesis, we design a unified diffusion model to generate novel view videos of dynamic 3D objects. Specifically, given a monocular reference video, SV4D generates novel views for each video frame that are temporally consistent. We then use the generated novel view videos to optimize an implicit 4D representation (dynamic NeRF) efficiently, without the need for cumbersome SDS-based optimization used in most prior works. To train our unified novel view video generation model, we curated a dynamic 3D object dataset from the existing Objaverse dataset. Extensive experimental results on multiple datasets and user studies demonstrate SV4D's state-of-the-art performance on novel-view video synthesis as well as 4D generation compared to prior works.
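The two-stage pipeline the abstract describes — a unified diffusion model produces a frame-by-view grid of novel-view videos, which is then used to fit a dynamic NeRF without SDS — can be sketched as follows. This is a minimal illustrative stand-in, not the authors' implementation: the function names, data shapes, and the trivial "optimization" are all hypothetical placeholders.

```python
# Hedged sketch of the SV4D two-stage pipeline described in the abstract.
# All names and structures here are illustrative placeholders.

def generate_novel_view_grid(reference_video, num_views):
    """Stage 1 (stand-in): the unified diffusion model would jointly
    generate an F x V grid of images that is consistent across both
    time (frames) and camera (views). We fake each image as a tuple."""
    return [
        [("frame", f, "view", v) for v in range(num_views)]
        for f in range(len(reference_video))
    ]

def optimize_4d_representation(view_grid):
    """Stage 2 (stand-in): fit an implicit 4D representation (dynamic
    NeRF) directly to the generated grid via reconstruction losses,
    avoiding SDS-based optimization. Here the 'field' simply indexes
    the grid so it can be queried at (time, view)."""
    def field(frame_idx, view_idx):
        return view_grid[frame_idx][view_idx]
    return field

reference_video = ["img0", "img1", "img2"]   # monocular input frames
grid = generate_novel_view_grid(reference_video, num_views=4)
nerf4d = optimize_4d_representation(grid)
print(nerf4d(2, 3))  # query the fitted 4D representation
```

The key design point the abstract emphasizes is that stage 1 is a single model producing the whole grid at once, so temporal and multi-view consistency are enforced jointly rather than by two separately trained generators.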

