Styl3R: Instant 3D Stylized Reconstruction for Arbitrary Scenes and Styles
May 27, 2025
Authors: Peng Wang, Xiang Liu, Peidong Liu
cs.AI
Abstract
Stylizing 3D scenes instantly while maintaining multi-view consistency and
faithfully resembling a style image remains a significant challenge. Current
state-of-the-art 3D stylization methods typically involve computationally
intensive test-time optimization to transfer artistic features into a
pretrained 3D representation, often requiring dense posed input images. In
contrast, leveraging recent advances in feed-forward reconstruction models, we
demonstrate a novel approach to achieve direct 3D stylization in less than a
second using unposed sparse-view scene images and an arbitrary style image. To
address the inherent entanglement between reconstruction and stylization, we
introduce a branched architecture that separates structure modeling and
appearance shading, effectively preventing stylistic transfer from distorting
the underlying 3D scene structure. Furthermore, we adapt an identity loss to
facilitate pre-training our stylization model through the novel view synthesis
task. This strategy also allows our model to retain its original reconstruction
capabilities while being fine-tuned for stylization. Comprehensive evaluations,
using both in-domain and out-of-domain datasets, demonstrate that our approach
produces high-quality stylized 3D content that achieves a superior blend of
style and scene appearance, while also outperforming existing methods in terms
of multi-view consistency and efficiency.
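
To make the branched design concrete, below is a minimal PyTorch sketch of how structure modeling and appearance shading could be separated. All names and shapes here (BranchedStylizer, feat_dim, an 11-channel Gaussian geometry head, a pooled style code) are assumptions for illustration, not the authors' implementation; the point it demonstrates is that geometry is predicted from scene features alone, so the style image can only influence colors.

```python
# Hypothetical sketch of a branched stylized-reconstruction network, not the
# authors' released code. A shared encoder feeds two branches: a structure
# branch predicting 3D Gaussian geometry from the scene views alone, and an
# appearance branch predicting colors conditioned on style features.
import torch
import torch.nn as nn


class BranchedStylizer(nn.Module):
    def __init__(self, feat_dim: int = 256):
        super().__init__()
        # Shared convolutional encoder for scene and style images (assumed).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat_dim, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Structure branch: per-pixel 3D Gaussian geometry
        # (3 position + 3 scale + 4 rotation + 1 opacity = 11 channels).
        self.structure_head = nn.Conv2d(feat_dim, 11, 1)
        # Appearance branch: per-pixel color, conditioned on scene and style
        # features concatenated along the channel axis.
        self.appearance_head = nn.Conv2d(2 * feat_dim, 3, 1)

    def forward(self, scene_views: torch.Tensor, style_img: torch.Tensor):
        # scene_views: (V, 3, H, W) unposed sparse views; style_img: (1, 3, H, W).
        scene_feat = self.encoder(scene_views)
        style_feat = self.encoder(style_img)
        # Geometry depends only on the scene; the style image never touches it.
        geometry = self.structure_head(scene_feat)
        # Broadcast a globally pooled style descriptor over every view/pixel.
        style_code = style_feat.mean(dim=(2, 3), keepdim=True)
        style_code = style_code.expand(
            scene_feat.shape[0], -1, scene_feat.shape[2], scene_feat.shape[3]
        )
        colors = self.appearance_head(torch.cat([scene_feat, style_code], dim=1))
        return geometry, colors  # together parameterize stylized 3D Gaussians
```

Keeping the style pathway out of the geometry head is the design point the abstract describes: even an aggressive style image cannot distort the recovered 3D scene structure.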
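
The identity loss used for pre-training can be sketched in the same spirit: when the "style" image is simply one of the scene's own views, the stylized render should match ordinary novel view synthesis, which lets the stylization model be pre-trained on that task and retain reconstruction ability during fine-tuning. The L1 form and the render_fn callable below are assumptions for illustration.

```python
import torch
import torch.nn.functional as F


def identity_loss(render_fn, scene_views: torch.Tensor, target: torch.Tensor):
    """Hypothetical identity loss: feed a scene view as the 'style' image and
    require the stylized render to match a ground-truth novel view.

    render_fn: assumed callable (scene_views, style_img) -> (1, 3, H, W) render.
    """
    # Identity condition: the style image is one of the input scene views.
    style_img = scene_views[:1]
    rendered = render_fn(scene_views, style_img)
    # Penalize any deviation from plain novel view synthesis.
    return F.l1_loss(rendered, target)
```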