PLA4D: Pixel-Level Alignments for Text-to-4D Gaussian Splatting
May 30, 2024
Authors: Qiaowei Miao, Yawei Luo, Yi Yang
cs.AI
Abstract
As text-conditioned diffusion models (DMs) achieve breakthroughs in image,
video, and 3D generation, the research community's focus has shifted to the
more challenging task of text-to-4D synthesis, which introduces a temporal
dimension to generate dynamic 3D objects. In this context, we identify Score
Distillation Sampling (SDS), a widely used technique for text-to-3D synthesis,
as a significant hindrance to text-to-4D performance because of its Janus-faced
artifacts and unrealistic textures, coupled with high computational cost. In this
paper, we propose Pixel-Level Alignments for
Text-to-4D Gaussian Splatting (PLA4D), a novel method that
utilizes text-to-video frames as explicit pixel alignment targets to generate
static 3D objects and inject motion into them. Specifically, we introduce Focal
Alignment to calibrate camera poses for rendering and GS-Mesh Contrastive
Learning to distill geometry priors from rendered image contrasts at the pixel
level. Additionally, we develop Motion Alignment using a deformation network to
drive changes in Gaussians and implement Reference Refinement for smooth 4D
object surfaces. These techniques enable 4D Gaussian Splatting to align
geometry, texture, and motion with generated videos at the pixel level.
Compared to previous methods, PLA4D produces synthesized outputs with better
texture details in less time and effectively mitigates the Janus-faced problem.
PLA4D is fully implemented using open-source models, offering an accessible,
user-friendly, and promising direction for 4D digital content creation. Our
project page:
https://github.com/MiaoQiaowei/PLA4D.github.io.
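
As a rough illustration of what aligning to video frames at the pixel level can mean, the toy sketch below fits a per-frame offset so that rendered pixels match target frames under a mean squared error. It is not the PLA4D implementation. The names `render`, `pixel_loss`, and `fit_offsets` are hypothetical, the renderer is a stand-in for a differentiable Gaussian Splatting renderer, and a single scalar offset per frame takes the place of a real deformation network.

```python
# Toy sketch under stated assumptions; every name here is hypothetical.

def render(base, offset):
    """Stand-in renderer: static per-pixel values shifted by a per-frame offset."""
    return [p + offset for p in base]

def pixel_loss(rendered, target):
    """Mean squared per-pixel error between a rendered frame and a video frame."""
    return sum((r - t) ** 2 for r, t in zip(rendered, target)) / len(target)

def fit_offsets(base, video, steps=200, lr=0.1):
    """Fit one offset per frame by gradient descent on the pixel loss."""
    offsets = [0.0] * len(video)
    for _ in range(steps):
        for i, frame in enumerate(video):
            rendered = render(base, offsets[i])
            # Analytic gradient: d(loss)/d(offset) = 2 * mean(rendered - target)
            grad = 2.0 * sum(r - t for r, t in zip(rendered, frame)) / len(frame)
            offsets[i] -= lr * grad
    return offsets

base = [0.2, 0.5, 0.8]                      # static "object" matching the first frame
video = [[0.2, 0.5, 0.8], [0.3, 0.6, 0.9]]  # target frames, e.g. from a text-to-video model
offsets = fit_offsets(base, video)
```

Here the first frame needs no motion and the second is recovered by an offset close to 0.1. The point of the sketch is only that each target frame gives an explicit per-pixel objective, in contrast to the score-based guidance of SDS.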