PLA4D: Pixel-Level Alignments for Text-to-4D Gaussian Splatting
May 30, 2024
Authors: Qiaowei Miao, Yawei Luo, Yi Yang
cs.AI
Abstract
As text-conditioned diffusion models (DMs) achieve breakthroughs in image,
video, and 3D generation, the research community's focus has shifted to the
more challenging task of text-to-4D synthesis, which introduces a temporal
dimension to generate dynamic 3D objects. In this context, we identify Score
Distillation Sampling (SDS), a widely used technique for text-to-3D synthesis,
as a significant obstacle to text-to-4D performance because of its Janus-faced
artifacts and unrealistic textures, coupled with high computational cost. In
this paper, we propose Pixel-Level Alignments for Text-to-4D Gaussian Splatting
(PLA4D), a novel method that uses text-to-video frames as explicit pixel
alignment targets to generate static 3D objects and inject motion into them.
Specifically, we introduce Focal
Alignment to calibrate camera poses for rendering and GS-Mesh Contrastive
Learning to distill geometry priors from rendered image contrasts at the pixel
level. Additionally, we develop Motion Alignment using a deformation network to
drive changes in Gaussians and implement Reference Refinement for smooth 4D
object surfaces. These techniques enable 4D Gaussian Splatting to align
geometry, texture, and motion with generated videos at the pixel level.
Compared to previous methods, PLA4D produces synthesized outputs with better
texture details in less time and effectively mitigates the Janus-faced problem.
PLA4D is fully implemented using open-source models, offering an accessible,
user-friendly, and promising direction for 4D digital content creation.
Project page: https://github.com/MiaoQiaowei/PLA4D.github.io
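To make the idea of an explicit pixel alignment target concrete, the following is a minimal sketch, not the authors' implementation. It assumes that rendered frames and text-to-video frames are flat lists of grayscale intensities, and that the objective is a plain mean squared error. The abstract does not specify PLA4D's actual loss functions, and the name `pixel_alignment_loss` is hypothetical.

```python
def pixel_alignment_loss(rendered_frames, target_frames):
    """Mean squared error over every pixel of every frame.

    Assumption: each frame is a flat list of grayscale floats. This is an
    illustrative stand-in for a pixel-level objective, not the PLA4D loss.
    """
    if len(rendered_frames) != len(target_frames):
        raise ValueError("frame counts differ")
    total = 0.0
    count = 0
    for rendered, target in zip(rendered_frames, target_frames):
        if len(rendered) != len(target):
            raise ValueError("frame sizes differ")
        for r, t in zip(rendered, target):
            total += (r - t) ** 2
            count += 1
    if count == 0:
        raise ValueError("no pixels to compare")
    return total / count


# Usage: compare one rendered frame against one target video frame.
loss = pixel_alignment_loss([[0.0, 1.0]], [[0.0, 0.0]])
print(loss)
```

Unlike SDS, which supervises renders through a diffusion model's score, an objective of this kind compares renders directly against fixed target pixels, which is the contrast the abstract draws.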