ICON: Incremental CONfidence for Joint Pose and Radiance Field Optimization

Abstract

Neural Radiance Fields (NeRF) exhibit remarkable performance for Novel View Synthesis (NVS) given a set of 2D images. However, NeRF training requires an accurate camera pose for each input view, typically obtained from Structure-from-Motion (SfM) pipelines. Recent works have attempted to relax this constraint, but they still often rely on decent initial poses that they can then refine. Here we aim to remove the requirement for pose initialization. We present Incremental CONfidence (ICON), an optimization procedure for training NeRFs from 2D video frames. ICON assumes only smooth camera motion to estimate an initial guess for the poses. Further, ICON introduces "confidence": an adaptive measure of model quality used to dynamically reweight gradients. ICON relies on high-confidence poses to learn the NeRF, and on high-confidence 3D structure (as encoded by the NeRF) to learn poses. We show that ICON, without prior pose initialization, achieves superior performance on both CO3D and HO3D compared with methods that use SfM poses.
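The two ideas named in the abstract, a smooth-motion pose initialization and confidence-based gradient reweighting, can be illustrated with a minimal sketch. The abstract does not define how confidence is computed or how poses are propagated, so everything below is an assumption made for illustration: the function names, the exponential mapping from photometric error to confidence, and the `temperature` parameter are hypothetical and are not taken from the paper.

```python
import math


def init_next_pose(previous_pose):
    """Smooth-motion assumption: start a new frame at the previous frame's pose.

    `previous_pose` is a 4x4 camera matrix given as nested lists. Copying the
    previous pose is an assumed initialization rule, not the paper's.
    """
    return [row[:] for row in previous_pose]


def confidence_from_error(error, temperature=0.1):
    """Hypothetical confidence in (0, 1]: lower photometric error gives higher confidence."""
    return math.exp(-error / temperature)


def reweight(per_frame_grads, confidences):
    """Scale each frame's gradient vector by that frame's confidence."""
    return [[c * g for g in grad] for grad, c in zip(per_frame_grads, confidences)]


# Usage: frames with a larger rendering error contribute smaller updates.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
new_pose = init_next_pose(identity)
errors = [0.0, 0.1, 0.5]
confidences = [confidence_from_error(e) for e in errors]
weighted = reweight([[1.0, 2.0]] * len(errors), confidences)
```

In this sketch a frame rendered with near-zero error keeps its full gradient, while a poorly explained frame is damped. That mirrors the abstract's statement that high-confidence poses drive NeRF learning and high-confidence structure drives pose learning, without claiming to reproduce the actual procedure.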
