What You See is What You GAN: Rendering Every Pixel for High-Fidelity Geometry in 3D GANs
January 4, 2024
Authors: Alex Trevithick, Matthew Chan, Towaki Takikawa, Umar Iqbal, Shalini De Mello, Manmohan Chandraker, Ravi Ramamoorthi, Koki Nagano
cs.AI
Abstract
3D-aware Generative Adversarial Networks (GANs) have shown remarkable
progress in learning to generate multi-view-consistent images and 3D geometries
of scenes from collections of 2D images via neural volume rendering. Yet, the
significant memory and computational costs of dense sampling in volume
rendering have forced 3D GANs to adopt patch-based training or employ
low-resolution rendering with post-processing 2D super-resolution, which
sacrifices multi-view consistency and the quality of resolved geometry.
Consequently, 3D GANs have not yet been able to fully resolve the rich 3D
geometry present in 2D images. In this work, we propose techniques to scale
neural volume rendering to the much higher resolution of native 2D images,
thereby resolving fine-grained 3D geometry with unprecedented detail. Our
approach employs learning-based samplers that accelerate neural rendering for
3D GAN training, using up to 5 times fewer depth samples. This enables us to
explicitly "render every pixel" of the full-resolution image during training
and inference without post-processing super-resolution in 2D. Together with our
strategy to learn high-quality surface geometry, our method synthesizes
high-resolution 3D geometry and strictly view-consistent images while
maintaining image quality on par with baselines relying on post-processing
super-resolution. We demonstrate state-of-the-art 3D geometric quality on FFHQ
and AFHQ, setting a new standard for unsupervised learning of 3D shapes in 3D
GANs.
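
To make the cost argument concrete, the following is a minimal sketch of why the number of depth samples per ray matters in volume rendering. It is not the authors' learned sampler, which the abstract does not specify. It substitutes a conventional coarse-to-fine scheme: standard volume-rendering quadrature followed by inverse-CDF sampling from the resulting weights. The function names (`composite`, `sample_depths`), the toy scene, and the sample counts (48 coarse, 8 fine) are all assumptions chosen for illustration.

```python
import math
import random
from bisect import bisect_right


def composite(sigmas, colors, edges):
    """Standard volume-rendering quadrature along one ray.

    sigmas: N densities, colors: N RGB tuples, edges: N+1 sorted depth edges.
    Returns the composited RGB and the per-interval weights.
    """
    rgb = [0.0, 0.0, 0.0]
    weights = []
    transmittance = 1.0  # fraction of light that reaches the current interval
    for sigma, color, near, far in zip(sigmas, colors, edges[:-1], edges[1:]):
        alpha = 1.0 - math.exp(-sigma * (far - near))  # opacity of this interval
        w = transmittance * alpha
        weights.append(w)
        rgb = [acc + w * c for acc, c in zip(rgb, color)]
        transmittance *= 1.0 - alpha
    return rgb, weights


def sample_depths(edges, weights, n, rng):
    """Inverse-transform sampling from the piecewise-constant PDF given by weights."""
    eps = 1e-8  # keeps the CDF strictly increasing where weights are zero
    total = sum(w + eps for w in weights)
    cdf = [0.0]
    for w in weights:
        cdf.append(cdf[-1] + (w + eps) / total)
    depths = []
    for _ in range(n):
        u = rng.random()
        # Index of the interval whose CDF range contains u (clamped for rounding).
        i = min(bisect_right(cdf, u) - 1, len(weights) - 1)
        frac = (u - cdf[i]) / (cdf[i + 1] - cdf[i])
        depths.append(edges[i] + frac * (edges[i + 1] - edges[i]))
    return sorted(depths)


# Toy scene (hypothetical): an opaque red surface near depth 0.42 on a unit-length ray.
edges = [i / 48 for i in range(49)]
sigmas = [1e3 if lo <= 0.42 < hi else 0.0 for lo, hi in zip(edges[:-1], edges[1:])]
colors = [(1.0, 0.0, 0.0)] * 48
rgb, weights = composite(sigmas, colors, edges)

# Eight importance samples land near the surface rather than spread over the ray.
fine = sample_depths(edges, weights, 8, random.Random(0))
```

In this sketch the coarse pass still evaluates every interval. A sampler that predicts where the surface lies without that dense pass is what would allow fewer samples per ray overall, which is the role the abstract assigns to its learning-based samplers.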