Dreamer XL: Towards High-Resolution Text-to-3D Generation via Trajectory Score Matching

May 18, 2024
Authors: Xingyu Miao, Haoran Duan, Varun Ojha, Jun Song, Tejal Shah, Yang Long, Rajiv Ranjan
cs.AI

Abstract

In this work, we propose a novel Trajectory Score Matching (TSM) method that addresses the pseudo ground truth inconsistency problem caused by accumulated error in Interval Score Matching (ISM) when using the Denoising Diffusion Implicit Models (DDIM) inversion process. Unlike ISM, which uses DDIM inversion to compute along a single path, TSM uses DDIM inversion to generate two paths from the same starting point. Because both paths share that starting point, TSM accumulates less error than ISM and thereby alleviates the pseudo ground truth inconsistency. TSM also improves the stability and consistency of the paths the model generates during distillation. We demonstrate this experimentally and further show that ISM is a special case of TSM. In addition, to streamline the multi-stage optimization currently used for high-resolution text-to-3D generation, we adopt Stable Diffusion XL for guidance. To address the abnormal replication and splitting caused by unstable gradients during 3D Gaussian splatting when using Stable Diffusion XL, we propose a pixel-by-pixel gradient clipping method. Extensive experiments show that our model significantly surpasses state-of-the-art models in visual quality and performance. Code: https://github.com/xingy038/Dreamer-XL.
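
The abstract does not spell out the clipping rule, so the following is only an illustrative sketch of what a pixel-by-pixel gradient clip could look like, not the authors' implementation. It assumes the guidance gradient is an array of shape (C, H, W) and that each pixel's gradient vector (across channels) is rescaled so its L2 norm does not exceed a threshold. The function name `clip_gradient_per_pixel` and the parameters `max_norm` and `eps` are hypothetical.

```python
import numpy as np


def clip_gradient_per_pixel(grad, max_norm, eps=1e-8):
    """Rescale each pixel's gradient vector to an L2 norm of at most max_norm.

    Assumption (not from the paper): `grad` has shape (C, H, W) and the norm
    is taken over the channel axis, independently for every pixel.
    """
    # Per-pixel L2 norm over channels, kept as shape (1, H, W) for broadcasting.
    norms = np.linalg.norm(grad, axis=0, keepdims=True)
    # Scale is 1 where the norm is already within bounds, < 1 otherwise.
    scale = np.minimum(1.0, max_norm / (norms + eps))
    return grad * scale


# Small usage example with one large-gradient pixel and one small one.
grad = np.zeros((3, 2, 2))
grad[:, 0, 0] = [3.0, 4.0, 0.0]   # norm 5, will be scaled down
grad[:, 1, 1] = [0.1, 0.0, 0.0]   # norm 0.1, left unchanged
clipped = clip_gradient_per_pixel(grad, max_norm=1.0)
```

Clipping per pixel rather than over the whole gradient limits the influence of isolated outlier pixels while leaving well-behaved regions untouched, which is the intuition behind using it to stabilize the gradients that drive Gaussian replication and splitting.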
