Dreamer XL: Towards High-Resolution Text-to-3D Generation via Trajectory Score Matching

May 18, 2024
Authors: Xingyu Miao, Haoran Duan, Varun Ojha, Jun Song, Tejal Shah, Yang Long, Rajiv Ranjan
cs.AI

Abstract

In this work, we propose a novel Trajectory Score Matching (TSM) method that addresses the pseudo ground truth inconsistency problem caused by accumulated error in Interval Score Matching (ISM) when using the Denoising Diffusion Implicit Models (DDIM) inversion process. Unlike ISM, which uses DDIM inversion to compute along a single path, TSM uses DDIM inversion to generate two paths from the same starting point. Because both paths share that starting point, TSM accumulates less error than ISM and thus alleviates the pseudo ground truth inconsistency. TSM also improves the stability and consistency of the paths the model generates during distillation. We demonstrate this experimentally and further show that ISM is a special case of TSM. Furthermore, to improve on the current multi-stage optimization process for high-resolution text-to-3D generation, we adopt Stable Diffusion XL for guidance. To counter the abnormal replication and splitting that unstable gradients cause during 3D Gaussian splatting when Stable Diffusion XL is used, we propose a pixel-by-pixel gradient clipping method. Extensive experiments show that our model significantly surpasses state-of-the-art models in visual quality and performance. Code: https://github.com/xingy038/Dreamer-XL.
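
The abstract does not spell out the pixel-by-pixel gradient clipping rule, so the following is only a minimal sketch of one plausible reading: each pixel's gradient vector is rescaled so that its L2 norm stays below a threshold, and its direction is preserved. The function name `clip_gradient_per_pixel`, the `max_norm` threshold, and the `(H, W, C)` layout are illustrative assumptions, not taken from the Dreamer-XL code.

```python
import numpy as np


def clip_gradient_per_pixel(grad: np.ndarray, max_norm: float,
                            eps: float = 1e-8) -> np.ndarray:
    """Rescale each pixel's gradient so its L2 norm is at most max_norm.

    Assumption (not from the paper): grad has shape (H, W, C) and the
    norm is taken over the channel axis C.
    """
    # Per-pixel L2 norm over channels, kept as (H, W, 1) for broadcasting.
    norms = np.linalg.norm(grad, axis=-1, keepdims=True)
    # Scale is 1 where the norm is already within bounds, below 1 otherwise.
    scale = np.minimum(1.0, max_norm / (norms + eps))
    return grad * scale


# Toy 1x2 "image" gradient with 3 channels: one small pixel, one outlier.
grad = np.array([[[0.3, 0.0, 0.4], [30.0, 0.0, 40.0]]])
clipped = clip_gradient_per_pixel(grad, max_norm=1.0)
pixel_norms = np.linalg.norm(clipped, axis=-1)
```

In this toy case the first pixel (norm 0.5) passes through unchanged, while the outlier (norm 50) is scaled down to roughly unit norm along the same direction. Clipping per pixel rather than over the whole image means a few extreme pixels are bounded without shrinking the well-behaved gradients elsewhere.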