LagerNVS: Latent Geometry for Fully Neural Real-time Novel View Synthesis

Abstract

Recent work has shown that neural networks can perform 3D tasks such as Novel View Synthesis (NVS) without explicit 3D reconstruction. Even so, we argue that strong 3D inductive biases remain helpful in the design of such networks. We demonstrate this by introducing LagerNVS, an encoder-decoder neural network for NVS that builds on '3D-aware' latent features. The encoder is initialized from a 3D reconstruction network pre-trained with explicit 3D supervision. It is paired with a lightweight decoder, and the full model is trained end-to-end with photometric losses. LagerNVS achieves state-of-the-art deterministic feed-forward NVS (including 31.4 dB PSNR on Re10k), both with and without known cameras. It renders in real time, generalizes to in-the-wild data, and can be paired with a diffusion decoder for generative extrapolation.
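To make the reported PSNR figure concrete, the sketch below computes PSNR from a mean squared error using the standard definition, PSNR = 10 * log10(max_val^2 / MSE). It is an illustration only, not the paper's evaluation code: the function names `mse` and `psnr`, the flat-list image representation, and the assumption that pixel values lie in [0, 1] are all choices made here for the example.

```python
import math


def mse(pred, target):
    """Mean squared error between two equal-length sequences of pixel values."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB (assumes pixel values in [0, max_val])."""
    err = mse(pred, target)
    if err == 0.0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_val ** 2 / err)


# Hypothetical two-pixel "images" with a uniform error of 0.1 per pixel,
# which gives an MSE of 0.01.
rendered = [0.6, 0.4]
reference = [0.5, 0.5]
print(round(psnr(rendered, reference), 2))
```

Under the same [0, 1] assumption, a PSNR of 31.4 dB on a single image corresponds to an MSE of about 7.2e-4, or a root-mean-square pixel error of roughly 0.027. Benchmark scores are typically averaged over many images, so this conversion is only a rough guide to the error level behind the reported number.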
