
WorldFlow3D: Flowing Through 3D Distributions for Unbounded World Generation

March 31, 2026
作者: Amogh Joshi, Julian Ost, Felix Heide
cs.AI

Abstract

Unbounded 3D world generation is emerging as a foundational task for scene modeling in computer vision, graphics, and robotics. In this work, we present WorldFlow3D, a novel method capable of generating unbounded 3D worlds. Building upon a foundational property of flow matching - namely, defining a path of transport between two data distributions - we model 3D generation more generally as a problem of flowing through 3D data distributions, not limited to conditional denoising. We find that our latent-free flow approach generates causal and accurate 3D structure, and can use this as an intermediate distribution to guide the generation of more complex structure and high-quality texture - all while converging more rapidly than existing methods. We enable controllability over generated scenes through vectorized scene layout conditions for geometric structure and scene attributes for visual texture. We confirm the effectiveness of WorldFlow3D on both real outdoor driving scenes and synthetic indoor scenes, validating cross-domain generalizability and high-quality generation on real data distributions. WorldFlow3D achieves favorable scene generation fidelity over comparison approaches in all tested settings for unbounded scene generation. For more, see https://light.princeton.edu/worldflow3d.
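The flow-matching property the abstract builds on - a transport path between two data distributions - can be illustrated with a minimal sketch. This is a generic rectified-flow setup on toy Gaussian samples standing in for two 3D data distributions; the interpolation and velocity target are standard flow-matching constructions, not the paper's actual model or training code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for two 3D data distributions: a source (e.g. noise or a
# coarse-structure distribution) and a target (e.g. a richer 3D distribution).
x0 = rng.normal(size=(1024, 3))            # source samples
x1 = rng.normal(loc=2.0, size=(1024, 3))   # target samples

# Rectified-flow transport path: x_t = (1 - t) * x0 + t * x1, for t in [0, 1].
# The regression target for a learned velocity field v_theta(x_t, t) is the
# constant velocity of this path, v = x1 - x0 (trained with an MSE loss).
t = rng.uniform(size=(1024, 1))
x_t = (1.0 - t) * x0 + t * x1
v_target = x1 - x0

# Sanity check: integrating the target velocity from x_t recovers both
# endpoints of the transport path exactly.
assert np.allclose(x_t + (1.0 - t) * v_target, x1)
assert np.allclose(x_t - t * v_target, x0)
```

At sampling time, one would start from a source sample and integrate the learned velocity field from t = 0 to t = 1 (e.g. with an Euler solver); chaining such flows through intermediate distributions is what the paper refers to as flowing through 3D data distributions.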