3DitScene: Editing Any Scene via Language-guided Disentangled Gaussian Splatting
May 28, 2024
Authors: Qihang Zhang, Yinghao Xu, Chaoyang Wang, Hsin-Ying Lee, Gordon Wetzstein, Bolei Zhou, Ceyuan Yang
cs.AI
Abstract
Scene image editing is crucial for entertainment, photography, and
advertising design. Existing methods focus solely on either 2D individual-object
editing or 3D global scene editing, leaving no unified approach for controlling
and manipulating scenes in 3D at different levels of granularity.
In this work, we propose 3DitScene, a novel and unified
scene editing framework leveraging language-guided disentangled Gaussian
Splatting that enables seamless editing from 2D to 3D, allowing precise control
over scene composition and individual objects. We first incorporate 3D
Gaussians that are refined through generative priors and optimization
techniques. Language features from CLIP then introduce semantics into 3D
geometry for object disentanglement. With the disentangled Gaussians, 3DitScene
allows for manipulation at both the global and individual levels,
revolutionizing creative expression and empowering control over scenes and
objects. Experimental results demonstrate the effectiveness and versatility of
3DitScene in scene image editing. Code and an online demo are available at our
project homepage: https://zqh0253.github.io/3DitScene/.