

WorldFlow3D: Flowing Through 3D Distributions for Unbounded World Generation

March 31, 2026
Authors: Amogh Joshi, Julian Ost, Felix Heide
cs.AI

Abstract

Unbounded 3D world generation is emerging as a foundational task for scene modeling in computer vision, graphics, and robotics. In this work, we present WorldFlow3D, a novel method for generating unbounded 3D worlds. Building on a foundational property of flow matching, namely that it defines a transport path between two data distributions, we model 3D generation more generally as a problem of flowing through 3D data distributions rather than restricting it to conditional denoising. We find that our latent-free flow approach generates causal and accurate 3D structure, which it can then use as an intermediate distribution to guide the generation of more complex structure and high-quality texture, all while converging more rapidly than existing methods. We make generated scenes controllable through vectorized scene layout conditions for geometric structure and through scene attributes for visual texture. We confirm the effectiveness of WorldFlow3D on both real outdoor driving scenes and synthetic indoor scenes, validating cross-domain generalizability and high-quality generation on real data distributions. In all tested settings for unbounded scene generation, our method achieves favorable scene generation fidelity over existing approaches. For more, see https://light.princeton.edu/worldflow3d.
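The flow-matching property the abstract builds on, a transport path between two data distributions, can be illustrated with a minimal sketch. This is a generic toy example, not the authors' implementation: the distributions, shapes, and linear (rectified) path below are all assumptions for illustration. In practice a network is trained to regress the velocity; here the conditional velocity of the linear path is known in closed form, so we can integrate it directly and watch source samples flow to the target.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two toy "data distributions" standing in for, e.g., a coarse 3D
# structure distribution (source) and a refined one (target).
# Shapes and parameters are hypothetical.
x0 = rng.normal(loc=-2.0, scale=0.5, size=(1024, 3))  # source samples
x1 = rng.normal(loc=3.0, scale=0.5, size=(1024, 3))   # target samples

def interpolant(x0, x1, t):
    """Linear transport path x_t = (1 - t) * x0 + t * x1."""
    return (1.0 - t) * x0 + t * x1

# For this path the conditional velocity is constant in t:
# v(x_t, t) = x1 - x0. This is the regression target a flow-matching
# model would be trained on; here we use it directly.
v_target = x1 - x0

# Euler integration of the velocity field moves samples from the
# source distribution to the target distribution.
steps = 10
x = x0.copy()
for _ in range(steps):
    x = x + (1.0 / steps) * v_target  # a trained model would predict this

# With the exact linear velocity, the flow lands on the target samples.
assert np.allclose(x, x1, atol=1e-6)
```

With a learned velocity field, the same Euler loop is the sampler: only `v_target` is replaced by the network's prediction at each step.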