Multi-View 3D Point Tracking

Abstract

We introduce the first data-driven multi-view 3D point tracker, designed to track arbitrary points in dynamic scenes using multiple camera views. Existing monocular trackers struggle with depth ambiguities and occlusion, and prior multi-camera methods require more than 20 cameras and tedious per-sequence optimization. In contrast, our feed-forward model directly predicts 3D correspondences using a practical number of cameras (e.g., four), enabling robust and accurate online tracking. Given known camera poses and either sensor-based or estimated multi-view depth, our tracker fuses multi-view features into a unified point cloud and applies k-nearest-neighbors correlation alongside a transformer-based update to reliably estimate long-range 3D correspondences, even under occlusion. We train on 5K synthetic multi-view Kubric sequences and evaluate on two real-world benchmarks, Panoptic Studio and DexYCB, achieving median trajectory errors of 3.1 cm and 2.0 cm, respectively. Our method generalizes well to diverse camera setups of 1 to 8 views with varying vantage points and to video lengths of 24 to 150 frames. By releasing our tracker alongside training and evaluation datasets, we aim to set a new standard for multi-view 3D tracking research and provide a practical tool for real-world applications. Project page available at https://ethz-vlg.github.io/mvtracker.
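The fusion and k-nearest-neighbors steps named in the abstract can be illustrated with a minimal NumPy sketch. This is not the released tracker: the function names (`unproject_depth`, `knn`, `knn_correlation`) are hypothetical, the pinhole model with world-to-camera extrinsics `x_cam = R @ x_world + t` is an assumed convention, and the dot-product correlation is a simple stand-in for whatever learned correlation the model actually uses. The transformer-based update is omitted.

```python
import numpy as np


def unproject_depth(depth, K, R, t):
    """Lift an (H, W) depth map to (H*W, 3) world-space points.

    Assumes a pinhole camera with intrinsics K and world-to-camera
    extrinsics x_cam = R @ x_world + t (a convention chosen for this sketch).
    """
    H, W = depth.shape
    v, u = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(np.float64)
    rays = pix @ np.linalg.inv(K).T           # camera-space rays with z = 1
    x_cam = rays * depth.reshape(-1, 1)       # scale each ray by its depth
    return (x_cam - t) @ R                    # row-vector form of R^T (x_cam - t)


def knn(query, points, k):
    """Brute-force k nearest neighbors: indices (N, k) and distances (N, k)."""
    d2 = ((query[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    idx = np.argsort(d2, axis=1)[:, :k]
    return idx, np.sqrt(np.take_along_axis(d2, idx, axis=1))


def knn_correlation(track_xyz, track_feat, cloud_xyz, cloud_feat, k):
    """Correlate each track feature with its k nearest fused-cloud features.

    Returns dot-product scores (N, k) and 3D offsets (N, k, 3) from each
    track point to its neighbors. Illustrative stand-in only.
    """
    idx, _ = knn(track_xyz, cloud_xyz, k)
    corr = np.einsum("nc,nkc->nk", track_feat, cloud_feat[idx])
    offsets = cloud_xyz[idx] - track_xyz[:, None, :]
    return corr, offsets
```

Fusing views then amounts to concatenating the per-camera outputs of `unproject_depth` (and their per-pixel features) into one point cloud before calling `knn_correlation`. The brute-force distance matrix is only practical for small clouds; a spatial index would be the usual replacement at scale.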
