DreamTime: An Improved Optimization Strategy for Text-to-3D Content Creation

Abstract

Text-to-image diffusion models pre-trained on billions of image-text pairs have recently enabled text-to-3D content creation by optimizing a randomly initialized Neural Radiance Field (NeRF) with score distillation. However, the resulting 3D models exhibit two limitations: (a) quality problems such as saturated colors and the Janus problem; (b) extremely low diversity compared with text-guided image synthesis. In this paper, we show that the conflict between the NeRF optimization process and uniform timestep sampling in score distillation is the main reason for these limitations. To resolve this conflict, we propose to prioritize timestep sampling with monotonically non-increasing functions, which aligns NeRF optimization with the sampling process of the diffusion model. Extensive experiments show that our simple redesign significantly improves text-to-3D content creation, with higher quality and diversity.
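
To make the contrast in the abstract concrete, the following is a minimal sketch of uniform timestep sampling next to a monotonically non-increasing timestep schedule. The abstract does not specify the function the paper uses, so the linear decay here is purely an illustrative assumption, as are the names `T_MAX`, `T_MIN`, `uniform_timestep`, and `decreasing_timestep`.

```python
import random

# Assumed timestep range of the diffusion model (hypothetical values).
T_MAX = 1000
T_MIN = 1


def uniform_timestep(rng: random.Random) -> int:
    """Baseline: draw a timestep uniformly at random at every iteration."""
    return rng.randint(T_MIN, T_MAX)


def decreasing_timestep(iteration: int, num_iterations: int) -> int:
    """Hypothetical monotonically non-increasing schedule (linear decay).

    Early iterations get large timesteps (high noise, coarse structure);
    late iterations get small timesteps (low noise, fine detail).
    """
    frac = iteration / max(num_iterations - 1, 1)
    return round(T_MAX - frac * (T_MAX - T_MIN))


# Timesteps for a toy run of 10 optimization iterations.
schedule = [decreasing_timestep(i, 10) for i in range(10)]
print(schedule)
```

In this sketch the timestep depends only on the optimization iteration, so the NeRF update at each step sees a noise level that moves from high to low, in the same direction as the diffusion model's own sampling process. Any other non-increasing function of the iteration index could be substituted for the linear decay.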
