WorldGrow: Generating Infinite 3D World

Abstract

We tackle the challenge of generating infinitely extendable 3D worlds: large, continuous environments with coherent geometry and realistic appearance. Existing methods face key challenges: 2D-lifting approaches suffer from geometric and appearance inconsistencies across views, 3D implicit representations are hard to scale up, and current 3D foundation models are mostly object-centric, which limits their applicability to scene-level generation. Our key insight is to leverage the strong generation priors of pre-trained 3D models for structured scene block generation. To this end, we propose WorldGrow, a hierarchical framework for unbounded 3D scene synthesis. Our method features three core components: (1) a data curation pipeline that extracts high-quality scene blocks for training, making the 3D structured latent representations suitable for scene generation; (2) a 3D block inpainting mechanism that enables context-aware scene extension; and (3) a coarse-to-fine generation strategy that ensures both global layout plausibility and local geometric/textural fidelity. Evaluated on the large-scale 3D-FRONT dataset, WorldGrow achieves state-of-the-art performance in geometry reconstruction while uniquely supporting infinite scene generation with photorealistic and structurally consistent outputs. These results highlight its capability for constructing large-scale virtual environments and its potential for building future world models.
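
To make the idea of context-aware scene extension by block inpainting more concrete, the following toy sketch grows a world one block at a time, filling each new block conditioned on its already generated neighbours. It is not the WorldGrow implementation: it works on a 2D height grid rather than 3D structured latents, the names `BLOCK`, `inpaint_block`, and `grow_world` are hypothetical, the "model" is a stand-in that samples around the mean of the known context, and the coarse-to-fine stage is omitted.

```python
import numpy as np

BLOCK = 8  # hypothetical block side length, in grid cells


def inpaint_block(context, mask, rng):
    """Stand-in for a learned block inpainting model (assumption, not the paper's).

    Cells where `mask` is True are filled with the mean of the known cells
    plus small noise; known cells are returned unchanged.
    """
    known = context[~mask]
    base = known.mean() if known.size else 0.0
    out = context.copy()
    out[mask] = base + 0.1 * rng.standard_normal(int(mask.sum()))
    return out


def grow_world(nx, ny, seed=0):
    """Grow an nx-by-ny grid of blocks in raster order."""
    rng = np.random.default_rng(seed)
    world = np.zeros((nx * BLOCK, ny * BLOCK))
    filled = np.zeros(world.shape, dtype=bool)
    for i in range(nx):
        for j in range(ny):
            # The window covers the new block plus up to one block of
            # previously generated context on the left and above.
            x0, y0 = max(i - 1, 0) * BLOCK, max(j - 1, 0) * BLOCK
            x1, y1 = (i + 1) * BLOCK, (j + 1) * BLOCK
            mask = ~filled[x0:x1, y0:y1]  # True only where nothing exists yet
            world[x0:x1, y0:y1] = inpaint_block(world[x0:x1, y0:y1], mask, rng)
            filled[x0:x1, y0:y1] = True
    return world


world = grow_world(2, 3)
print(world.shape)
```

Because each step only writes to cells that are still empty, earlier blocks are never modified, so the loop could in principle keep extending the grid in any direction for as long as needed.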
