FlexiDreamer: Single Image-to-3D Generation with FlexiCubes

April 1, 2024
Authors: Ruowen Zhao, Zhengyi Wang, Yikai Wang, Zihan Zhou, Jun Zhu
cs.AI

Abstract

3D content generation from text prompts or single images has recently made remarkable progress in quality and speed. One of its dominant paradigms generates consistent multi-view images and then performs sparse-view reconstruction. However, because directly deforming a mesh representation to approach the target topology is challenging, most methods learn an implicit representation (such as NeRF) during sparse-view reconstruction and obtain the target mesh through post-processing extraction. Although an implicit representation can effectively model rich 3D information, its training typically requires a long convergence time. In addition, extracting the mesh from the implicit field after training leads to undesirable visual artifacts. In this paper, we propose FlexiDreamer, a novel single image-to-3D generation framework that reconstructs the target mesh in an end-to-end manner. By leveraging a flexible gradient-based extraction method known as FlexiCubes, our method circumvents the defects introduced by post-processing and enables direct acquisition of the target mesh. Furthermore, we incorporate a multi-resolution hash grid encoding scheme that progressively activates the encoding levels of the implicit field in FlexiCubes to help capture geometric details at each optimization step. Notably, FlexiDreamer recovers a dense 3D structure from a single-view image in approximately 1 minute on a single NVIDIA A100 GPU, outperforming previous methods by a large margin.
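
The progressive activation of encoding levels mentioned in the abstract can be pictured as a coarse-to-fine mask over the hash grid features. The following is only a minimal sketch of that general idea, not the authors' implementation: the function names, the linear schedule, and every default value (`num_levels`, `base_levels`, `steps_per_level`, `feats_per_level`) are hypothetical and are not taken from the paper.

```python
def active_levels(step, num_levels=16, base_levels=4, steps_per_level=100):
    """Return how many encoding levels are enabled at an optimization step.

    Assumed schedule (hypothetical): start with `base_levels` coarse levels
    and enable one finer level every `steps_per_level` steps, capped at
    `num_levels`.
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    return min(num_levels, base_levels + step // steps_per_level)


def mask_features(features, step, feats_per_level=2, **schedule):
    """Zero out the features of levels that are not active yet.

    `features` is a flat list ordered from the coarsest to the finest level,
    with `feats_per_level` entries per level.
    """
    cutoff = active_levels(step, **schedule) * feats_per_level
    return [f if i < cutoff else 0.0 for i, f in enumerate(features)]


if __name__ == "__main__":
    encoded = [1.0] * 32  # 16 levels x 2 features, dummy values
    for step in (0, 250, 5000):
        masked = mask_features(encoded, step)
        print(step, active_levels(step), sum(masked))
```

Under this assumed schedule, early steps see only the coarse levels, so the optimization settles the overall shape before the finer levels start contributing detail.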
