Vidu4D: Single Generated Video to High-Fidelity 4D Reconstruction with Dynamic Gaussian Surfels
May 27, 2024
Authors: Yikai Wang, Xinzhou Wang, Zilong Chen, Zhengyi Wang, Fuchun Sun, Jun Zhu
cs.AI
Abstract
Video generative models are receiving particular attention given their
ability to generate realistic and imaginative frames. Besides, these models are
also observed to exhibit strong 3D consistency, significantly enhancing their
potential to act as world simulators. In this work, we present Vidu4D, a novel
reconstruction model that excels in accurately reconstructing 4D (i.e.,
sequential 3D) representations from single generated videos, addressing
challenges associated with non-rigidity and frame distortion. This capability
is pivotal for creating high-fidelity virtual content that maintains both
spatial and temporal coherence. At the core of Vidu4D is our proposed Dynamic
Gaussian Surfels (DGS) technique. DGS optimizes time-varying warping functions
to transform Gaussian surfels (surface elements) from a static state to a
dynamically warped state. This transformation enables a precise depiction of
motion and deformation over time. To preserve the structural integrity of
surface-aligned Gaussian surfels, we design the warped-state geometric
regularization based on continuous warping fields for estimating normals.
Additionally, we learn refinements on rotation and scaling parameters of
Gaussian surfels, which greatly alleviates texture flickering during the
warping process and enhances the capture of fine-grained appearance details.
Vidu4D also contains a novel initialization state that provides a proper start
for the warping fields in DGS. When Vidu4D is combined with an existing video
generative model, the overall framework demonstrates high-fidelity text-to-4D
generation in both appearance and geometry.
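The abstract's central idea, moving a surfel from a static state to a warped state with a time-varying warping function, can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the abstract does not specify the form of the warping field, so `toy_warp` is a hypothetical rigid motion (a rotation about the z-axis plus a drift along z), and `warp_surfel`, `static_center`, and `static_rotation` are made-up names. The sketch only shows that warping a surfel's center and orientation also determines its warped-state normal, the quantity the geometric regularization is said to act on.

```python
import math

def rot_z(theta):
    """3x3 rotation matrix about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matvec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def matmul(a, b):
    """Multiply two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def toy_warp(t):
    """Hypothetical time-varying rigid warp (NOT the paper's warping field).

    Returns a rotation R_t and a translation d_t for time t in [0, 1].
    """
    return rot_z(0.5 * math.pi * t), [0.0, 0.0, 0.1 * t]

def warp_surfel(center, rotation, t):
    """Map a static-state surfel to its warped state at time t.

    `rotation` holds the surfel frame as columns: two tangent axes
    followed by the normal.
    """
    r_t, d_t = toy_warp(t)
    new_center = [a + b for a, b in zip(matvec(r_t, center), d_t)]
    new_rotation = matmul(r_t, rotation)
    # The warped normal is the third column of the warped frame.
    normal = [new_rotation[i][2] for i in range(3)]
    return new_center, new_rotation, normal

# Static-state surfel: centered on the x-axis, with its normal along +x.
static_center = [1.0, 0.0, 0.0]
static_rotation = [[0.0, 0.0, 1.0],
                   [1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]]

for t in (0.0, 0.5, 1.0):
    center, _, normal = warp_surfel(static_center, static_rotation, t)
    print(t, [round(x, 3) for x in center], [round(x, 3) for x in normal])
```

In an actual system the warp would be a learned, spatially varying field rather than a single closed-form rigid motion; the per-surfel bookkeeping (warp the center, warp the frame, read the normal off the frame) is the only part this sketch is meant to convey.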