WonderZoom: Multi-Scale 3D World Generation

Abstract

We present WonderZoom, a novel approach to generating 3D scenes with content across multiple spatial scales from a single image. Existing 3D world generation models remain limited to single-scale synthesis and cannot produce coherent scene content at varying granularities. The fundamental challenge is the lack of a scale-aware 3D representation capable of generating and rendering content at vastly different spatial scales. WonderZoom addresses this through two key innovations: (1) scale-adaptive Gaussian surfels for generating multi-scale 3D scenes and rendering them in real time, and (2) a progressive detail synthesizer that iteratively generates finer-scale 3D content. Our approach lets users "zoom into" a 3D region and auto-regressively synthesize previously non-existent fine details, from landscapes down to microscopic features. Experiments demonstrate that WonderZoom significantly outperforms state-of-the-art video and 3D models in both quality and alignment, enabling multi-scale 3D world creation from a single image. Video results and an interactive viewer of generated multi-scale 3D worlds are available at https://wonderzoom.github.io/.
