NeoVerse: Enhancing 4D World Models with In-the-Wild Monocular Videos

Abstract

In this paper, we propose NeoVerse, a versatile 4D world model capable of 4D reconstruction, novel-trajectory video generation, and a range of downstream applications. We first identify a scalability limitation common to current 4D world modeling methods, which stems either from expensive, specialized multi-view 4D data or from cumbersome training pre-processing. In contrast, NeoVerse is built on a core philosophy of keeping the full pipeline scalable to diverse in-the-wild monocular videos. Specifically, NeoVerse features pose-free feed-forward 4D reconstruction, online simulation of monocular degradation patterns, and other well-aligned techniques. These designs give NeoVerse versatility and the ability to generalize across various domains. NeoVerse also achieves state-of-the-art performance on standard reconstruction and generation benchmarks. Our project page is available at https://neoverse-4d.github.io