DiffusionGAN3D: Boosting Text-guided 3D Generation and Domain Adaption by Combining 3D GANs and Diffusion Priors
December 28, 2023
Authors: Biwen Lei, Kai Yu, Mengyang Feng, Miaomiao Cui, Xuansong Xie
cs.AI
Abstract
Text-guided domain adaption and generation of 3D-aware portraits find many
applications in various fields. However, due to the lack of training data and
the challenges in handling the high variety of geometry and appearance, the
existing methods for these tasks suffer from issues like inflexibility,
instability, and low fidelity. In this paper, we propose a novel framework
DiffusionGAN3D, which boosts text-guided 3D domain adaption and generation by
combining 3D GANs and diffusion priors. Specifically, we integrate the
pre-trained 3D generative models (e.g., EG3D) and text-to-image diffusion
models. The former provides a strong foundation for stable and high-quality
avatar generation from text. And the diffusion models in turn offer powerful
priors and guide the 3D generator finetuning with informative direction to
achieve flexible and efficient text-guided domain adaption. To enhance the
diversity in domain adaption and the generation capability in text-to-avatar,
we introduce the relative distance loss and case-specific learnable triplane
respectively. In addition, we design a progressive texture refinement module to
improve the texture quality for both tasks above. Extensive experiments
demonstrate that the proposed framework achieves excellent results in both
domain adaption and text-to-avatar tasks, outperforming existing methods in
terms of generation quality and efficiency. The project homepage is at
https://younglbw.github.io/DiffusionGAN3D-homepage/.
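The abstract describes the relative distance loss only at a high level (it "enhances diversity in domain adaption"). As a minimal illustrative sketch, not the authors' exact formulation, such a loss can be read as penalizing changes in the pairwise distances between features of generated samples, so that the finetuned generator preserves the sample-to-sample diversity of the frozen source model. All function and variable names here are hypothetical:

```python
import numpy as np


def pairwise_cosine_distances(feats):
    # feats: (N, D) array of feature vectors for N generated samples.
    normed = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    sim = normed @ normed.T
    return 1.0 - sim  # (N, N) cosine-distance matrix


def relative_distance_loss(frozen_feats, tuned_feats):
    """Illustrative diversity-preservation loss (assumed formulation):
    penalize differences between the pairwise-distance matrices of the
    frozen source generator and the finetuned generator, so samples that
    were distinct before adaption stay distinct after it."""
    d_src = pairwise_cosine_distances(frozen_feats)
    d_tgt = pairwise_cosine_distances(tuned_feats)
    n = d_src.shape[0]
    mask = ~np.eye(n, dtype=bool)  # ignore zero self-distances
    return np.abs(d_src[mask] - d_tgt[mask]).mean()
```

Under this reading, the loss is zero when the finetuned generator reproduces the source model's relative sample geometry exactly, and grows as samples collapse toward each other (mode collapse) during text-guided finetuning.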