What You See is What You GAN: Rendering Every Pixel for High-Fidelity Geometry in 3D GANs
January 4, 2024
Authors: Alex Trevithick, Matthew Chan, Towaki Takikawa, Umar Iqbal, Shalini De Mello, Manmohan Chandraker, Ravi Ramamoorthi, Koki Nagano
cs.AI
Abstract
3D-aware Generative Adversarial Networks (GANs) have shown remarkable
progress in learning to generate multi-view-consistent images and 3D geometries
of scenes from collections of 2D images via neural volume rendering. Yet, the
significant memory and computational costs of dense sampling in volume
rendering have forced 3D GANs to adopt patch-based training or employ
low-resolution rendering with post-processing 2D super resolution, which
sacrifices multi-view consistency and the quality of resolved geometry.
Consequently, 3D GANs have not yet been able to fully resolve the rich 3D
geometry present in 2D images. In this work, we propose techniques to scale
neural volume rendering to the much higher resolution of native 2D images,
thereby resolving fine-grained 3D geometry with unprecedented detail. Our
approach employs learning-based samplers for accelerating neural rendering for
3D GAN training using up to 5 times fewer depth samples. This enables us to
explicitly "render every pixel" of the full-resolution image during training
and inference without post-processing super resolution in 2D. Together with our
strategy to learn high-quality surface geometry, our method synthesizes
high-resolution 3D geometry and strictly view-consistent images while
maintaining image quality on par with baselines relying on post-processing
super resolution. We demonstrate state-of-the-art 3D geometric quality on FFHQ
and AFHQ, setting a new standard for unsupervised learning of 3D shapes in 3D
GANs.
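
To make the role of depth samples concrete, the following is a minimal sketch of the standard volume-rendering quadrature along a single ray, combined with inverse-transform sampling from a per-ray proposal distribution. It is not the authors' implementation: the hand-written Gaussian proposal merely stands in for what a learned sampler might predict, and the toy density field, the function names (`composite`, `sample_from_proposal`, `toy_field`), the bin count, and the sample counts (120 dense versus 24 guided, a 5x reduction) are all assumptions chosen for illustration.

```python
import numpy as np


def composite(sigma, rgb, t):
    """Standard volume-rendering quadrature along one ray.

    sigma: (N,) densities, rgb: (N, 3) colors, t: (N,) sorted sample depths.
    Returns the composited color (3,) and the per-sample weights (N,).
    """
    # Interval widths; the last interval reuses the previous width.
    delta = np.diff(t, append=2 * t[-1] - t[-2])
    alpha = 1.0 - np.exp(-sigma * delta)  # opacity of each interval
    # Transmittance: fraction of light that reaches sample i unoccluded.
    trans = np.concatenate(([1.0], np.cumprod(1.0 - alpha)[:-1]))
    weights = trans * alpha
    return weights @ rgb, weights


def sample_from_proposal(edges, proposal, n):
    """Place n depths by inverting the CDF of a piecewise-constant proposal.

    edges: (B + 1,) bin edges, proposal: (B,) unnormalized bin weights.
    Evenly spaced quantiles are used, so the result is deterministic.
    """
    pdf = proposal + 1e-8  # keep the CDF strictly increasing
    pdf = pdf / pdf.sum()
    cdf = np.concatenate(([0.0], np.cumsum(pdf)))
    u = (np.arange(n) + 0.5) / n
    return np.interp(u, cdf, edges)


def toy_field(t):
    """Hypothetical scene: a soft surface near depth 2 with depth-varying color."""
    sigma = 40.0 * np.exp(-0.5 * ((t - 2.0) / 0.1) ** 2)
    rgb = np.stack([t / 4.0, np.full_like(t, 0.5), 1.0 - t / 4.0], axis=-1)
    return sigma, rgb


near, far = 0.0, 4.0
n_dense, n_sparse = 120, 24  # hypothetical counts, 5x apart

# Dense baseline: uniform samples along the whole ray.
t_dense = np.linspace(near, far, n_dense)
dense_rgb, dense_w = composite(*toy_field(t_dense), t_dense)

# Guided sampling: a Gaussian bump stands in for a learned sampler's output.
edges = np.linspace(near, far, 65)
centers = 0.5 * (edges[:-1] + edges[1:])
proposal = np.exp(-0.5 * ((centers - 2.0) / 0.15) ** 2)
t_sparse = sample_from_proposal(edges, proposal, n_sparse)
sparse_rgb, sparse_w = composite(*toy_field(t_sparse), t_sparse)

print("dense :", dense_rgb)
print("guided:", sparse_rgb)
```

In this toy setup the 24 guided samples should track the dense render closely, because nearly all of them land where the density is nonzero. Since the number of field queries per image grows as height × width × samples per ray, reducing the per-ray count is what makes rendering every pixel at full resolution affordable.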