NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos

Abstract

In this paper, we propose NeoVerse, a versatile 4D world model capable of 4D reconstruction, novel-trajectory video generation, and rich downstream applications. We first identify a scalability limitation common to current 4D world modeling methods, caused either by expensive, specialized multi-view 4D data or by cumbersome training pre-processing. In contrast, NeoVerse is built on a core philosophy of keeping the full pipeline scalable to diverse in-the-wild monocular videos. Specifically, NeoVerse features pose-free feed-forward 4D reconstruction, online monocular degradation pattern simulation, and other well-aligned techniques. These designs give NeoVerse versatility and generalization across various domains. Meanwhile, NeoVerse achieves state-of-the-art performance on standard reconstruction and generation benchmarks. The project page is available at https://neoverse-4d.github.io.
