4DGS360: 360° Gaussian Reconstruction of Dynamic Objects from a Single Video

Abstract

We introduce 4DGS360, a diffusion-free framework for 360° dynamic object reconstruction from casual monocular video. Existing methods often fail to reconstruct consistent 360° geometry because their heavy reliance on 2D-native priors causes the initial points to overfit to the surfaces visible in each training view. 4DGS360 addresses this challenge through an advanced 3D-native initialization that mitigates the geometric ambiguity of occluded regions. Our proposed 3D tracker, AnchorTAP3D, produces reinforced 3D point trajectories by leveraging confident 2D track points as anchors, suppressing drift and providing a reliable initialization that preserves geometry in occluded regions. This initialization, combined with optimization, yields coherent 360° 4D reconstructions. We further present iPhone360, a new benchmark in which test cameras are placed up to 135° apart from the training views, enabling 360° evaluation that existing datasets cannot provide. Experiments show that 4DGS360 achieves state-of-the-art performance on the iPhone360, iPhone, and DAVIS datasets, both qualitatively and quantitatively.
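To make the "up to 135° apart" figure concrete, the sketch below computes the angular separation between a test camera and the nearest training camera. The abstract does not say how this separation is measured; the sketch assumes it is the angle subtended at the object center between camera positions. The function names (`angular_separation_deg`, `min_separation_to_training`, `cam_on_circle`) and the camera layout are hypothetical and are not taken from the 4DGS360 code or the iPhone360 benchmark.

```python
import math


def angular_separation_deg(cam_a, cam_b, target):
    """Angle in degrees between two camera positions, measured at `target`.

    Assumption: separation is defined by the rays from the object center
    to each camera position (not by the cameras' optical axes).
    """
    va = [a - t for a, t in zip(cam_a, target)]
    vb = [b - t for b, t in zip(cam_b, target)]
    na = math.sqrt(sum(x * x for x in va))
    nb = math.sqrt(sum(x * x for x in vb))
    if na == 0.0 or nb == 0.0:
        raise ValueError("camera position coincides with the target")
    cos_angle = sum(x * y for x, y in zip(va, vb)) / (na * nb)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.degrees(math.acos(cos_angle))


def min_separation_to_training(test_cam, train_cams, target):
    """Smallest angular separation between a test camera and any training camera."""
    return min(angular_separation_deg(test_cam, c, target) for c in train_cams)


def cam_on_circle(azimuth_deg, radius=1.0):
    """Hypothetical helper: camera position on a horizontal circle around the origin."""
    a = math.radians(azimuth_deg)
    return (radius * math.cos(a), radius * math.sin(a), 0.0)


if __name__ == "__main__":
    target = (0.0, 0.0, 0.0)
    # Illustrative layout only: training views sweep 0° to 45°, test view at 180°.
    train_cams = [cam_on_circle(a) for a in (0, 20, 45)]
    test_cam = cam_on_circle(180)
    print(round(min_separation_to_training(test_cam, train_cams, target), 3))
```

In this illustrative layout the test camera is 135° away from the closest training view, so a method that only fits the surfaces seen during training would be evaluated almost entirely on geometry it never observed.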
