Vista4D: Video Reshooting with 4D Point Clouds

Abstract

We present Vista4D, a robust and flexible video reshooting framework that grounds the input video and target cameras in a 4D point cloud. Given an input video, our method re-synthesizes the scene with the same dynamics from a different camera trajectory and viewpoint. Existing video reshooting methods often struggle with depth estimation artifacts in real-world dynamic videos, while also failing to preserve content appearance or maintain precise camera control on challenging new trajectories. We build a 4D-grounded point cloud representation using static pixel segmentation and 4D reconstruction, which explicitly preserves seen content and provides rich camera signals, and we train on reconstructed multiview dynamic data for robustness to point cloud artifacts during real-world inference. Our results demonstrate improved 4D consistency, camera control, and visual quality over state-of-the-art baselines across a variety of videos and camera paths. Moreover, our method generalizes to real-world applications such as dynamic scene expansion and 4D scene recomposition. See our project page for results, code, and models: https://eyeline-labs.github.io/Vista4D
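
To make the idea of grounding a video and a target camera in a 4D point cloud more concrete, the following is a minimal geometric sketch and not the Vista4D implementation. It assumes per-frame depth maps, a shared pinhole intrinsics matrix `K`, camera poses, and boolean static-pixel masks are already available from some 4D reconstruction and segmentation step. The function names (`unproject`, `build_4d_point_cloud`, `project`) are hypothetical, and pooling static points across all frames is our assumption about how static segmentation could help preserve seen content. The learned synthesis step that turns point cloud renders into the final video is not shown.

```python
import numpy as np

def unproject(depth, K, cam_to_world):
    """Lift an (H, W) depth map to world-space points of shape (H*W, 3)."""
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]                      # pixel rows (v) and columns (u)
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(float)
    rays = pix @ np.linalg.inv(K).T                # camera-space rays with z = 1
    cam_pts = rays * depth.reshape(-1, 1)          # scale each ray by its depth
    return cam_pts @ cam_to_world[:3, :3].T + cam_to_world[:3, 3]

def build_4d_point_cloud(depths, static_masks, K, cams_to_world):
    """Return one point set per timestep.

    Assumption (ours): static points are pooled from every frame so that
    content seen at any time stays available, while dynamic points are
    kept only at the timestep where they were observed.
    """
    points = [unproject(d, K, c) for d, c in zip(depths, cams_to_world)]
    masks = [m.reshape(-1) for m in static_masks]
    static_pool = np.concatenate([p[m] for p, m in zip(points, masks)])
    return [np.concatenate([static_pool, p[~m]]) for p, m in zip(points, masks)]

def project(points, K, world_to_cam):
    """Project world points into a target camera: (N, 2) pixels and (N,) depths."""
    cam_pts = points @ world_to_cam[:3, :3].T + world_to_cam[:3, 3]
    z = cam_pts[:, 2]
    uv = (cam_pts @ K.T)[:, :2] / z[:, None]       # perspective divide
    return uv, z
```

Projecting each timestep's point set with `project` for every pose on the target trajectory would give a sparse, geometry-aligned render per output frame. A practical renderer would additionally discard points with non-positive depth and resolve occlusions with a z-buffer, both of which this sketch omits.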
