Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models
December 21, 2023
Authors: Huan Ling, Seung Wook Kim, Antonio Torralba, Sanja Fidler, Karsten Kreis
cs.AI
Abstract
Text-guided diffusion models have revolutionized image and video generation
and have also been successfully used for optimization-based 3D object
synthesis. Here, we instead focus on the underexplored text-to-4D setting and
synthesize dynamic, animated 3D objects using score distillation methods with
an additional temporal dimension. Compared to previous work, we pursue a novel
compositional generation-based approach, and combine text-to-image,
text-to-video, and 3D-aware multiview diffusion models to provide feedback
during 4D object optimization, thereby simultaneously enforcing temporal
consistency, high-quality visual appearance and realistic geometry. Our method,
called Align Your Gaussians (AYG), leverages dynamic 3D Gaussian Splatting with
deformation fields as its 4D representation. Crucial to AYG is a novel method to
regularize the distribution of the moving 3D Gaussians and thereby stabilize
the optimization and induce motion. We also propose a motion amplification
mechanism as well as a new autoregressive synthesis scheme to generate and
combine multiple 4D sequences for longer generation. These techniques allow us
to synthesize vivid dynamic scenes, outperform previous work qualitatively and
quantitatively and achieve state-of-the-art text-to-4D performance. Due to the
Gaussian 4D representation, different 4D animations can be seamlessly combined,
as we demonstrate. AYG opens up promising avenues for animation, simulation and
digital content creation as well as synthetic data generation.
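
The two ideas the abstract combines, a deformation field that moves 3D Gaussians over time and feedback composed from several diffusion models, can be illustrated with a toy sketch. This is not the authors' implementation: the function names `deform` and `compose_guidance`, the sinusoidal offset, the placeholder gradients, and the weights are all assumptions chosen only to keep the example small and runnable. A real system would presumably use a learned deformation network and gradients obtained by score distillation through a differentiable renderer.

```python
import numpy as np


def deform(means, t, amplitude=0.1):
    """Hypothetical deformation field: shift every Gaussian center at time t.

    A fixed sinusoid stands in for a learned network (assumption).
    The offset is zero at t = 0, so the canonical shape is recovered there.
    """
    offset = amplitude * np.sin(2.0 * np.pi * t) * np.ones_like(means)
    return means + offset


def compose_guidance(grads, weights):
    """Weighted sum of per-model guidance gradients (weights are assumptions)."""
    total = np.zeros_like(next(iter(grads.values())))
    for name, grad in grads.items():
        total += weights[name] * grad
    return total


# Toy scene: four Gaussians whose canonical centers sit at the origin.
means = np.zeros((4, 3))
moved = deform(means, t=0.25)  # centers a quarter of the way through one period

# Placeholder gradients standing in for feedback from three diffusion models.
grads = {
    "image": np.full((4, 3), 1.0),
    "video": np.full((4, 3), 2.0),
    "multiview": np.full((4, 3), 3.0),
}
weights = {"image": 0.5, "video": 1.0, "multiview": 0.25}
combined = compose_guidance(grads, weights)
```

In this sketch the canonical centers stay fixed and only the time-dependent offset changes, which is one simple way to read "dynamic 3D Gaussians with deformation fields"; how the actual method weights or schedules the three models is not stated in the abstract.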