LU-NeRF: Scene and Pose Estimation by Synchronizing Local Unposed NeRFs

Abstract

A critical obstacle to deploying NeRF models broadly in the wild is their reliance on accurate camera poses. Consequently, there is growing interest in extending NeRF models to jointly optimize camera poses and scene representation, which offers an alternative to off-the-shelf SfM pipelines that have well-understood failure modes. Existing approaches to unposed NeRF operate under limiting assumptions, such as a prior pose distribution or a coarse pose initialization, making them less effective in a general setting. In this work, we propose a novel approach, LU-NeRF, that jointly estimates camera poses and neural radiance fields with relaxed assumptions on pose configuration. Our approach operates in a local-to-global manner: we first optimize over local subsets of the data, dubbed mini-scenes, and LU-NeRF estimates local pose and geometry for this challenging few-shot task. The mini-scene poses are then brought into a global reference frame through a robust pose synchronization step, after which a final global optimization of pose and scene can be performed. We show that our LU-NeRF pipeline outperforms prior attempts at unposed NeRF without making restrictive assumptions on the pose prior. This allows us to operate in the general SE(3) pose setting, unlike the baselines. Our results also indicate that our model can be complementary to feature-based SfM pipelines, as it compares favorably to COLMAP on low-texture and low-resolution images.
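To make the idea of bringing local poses into a global reference frame concrete, the sketch below chains pairwise relative rotations into global rotations by breadth-first traversal of the camera graph. It is an illustration only, not the synchronization method used in LU-NeRF: it handles rotations rather than full SE(3) poses, it assumes noise-free and outlier-free measurements (so it is not robust), and it assumes the convention R_ij = R_i R_j^T. The function names `rot_x`, `rot_z`, and `synchronize_rotations` are hypothetical.

```python
from collections import deque

import numpy as np


def rot_x(theta):
    """Rotation by theta radians about the x-axis (demo helper)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])


def rot_z(theta):
    """Rotation by theta radians about the z-axis (demo helper)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def synchronize_rotations(num_cams, rel_rots):
    """Chain relative rotations into global ones, fixing camera 0 at identity.

    rel_rots maps (i, j) to R_ij, assumed to satisfy R_ij = R_i @ R_j.T.
    Assumes exact measurements; real pipelines need a robust estimator.
    """
    # Store each edge in both directions; the reverse of R_ij is R_ij.T.
    adj = {k: [] for k in range(num_cams)}
    for (i, j), r_ij in rel_rots.items():
        adj[i].append((j, r_ij))
        adj[j].append((i, r_ij.T))

    global_rots = {0: np.eye(3)}
    queue = deque([0])
    while queue:
        a = queue.popleft()
        for b, r_ab in adj[a]:
            if b not in global_rots:
                # From R_ab = R_a @ R_b.T it follows that R_b = R_ab.T @ R_a.
                global_rots[b] = r_ab.T @ global_rots[a]
                queue.append(b)

    if len(global_rots) != num_cams:
        raise ValueError("camera graph is disconnected")
    return [global_rots[k] for k in range(num_cams)]
```

With exact relative rotations on a connected graph, the recovered rotations equal the true ones up to the choice of reference camera. With noisy or partly wrong local estimates, a single chain propagates errors, which is why a robust synchronization step is needed in practice.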
