3DitScene: Editing Any Scene via Language-guided Disentangled Gaussian Splatting
May 28, 2024
Authors: Qihang Zhang, Yinghao Xu, Chaoyang Wang, Hsin-Ying Lee, Gordon Wetzstein, Bolei Zhou, Ceyuan Yang
cs.AI
Abstract
Scene image editing is crucial for entertainment, photography, and
advertising design. Existing methods focus solely on either editing individual
objects in 2D or editing global scenes in 3D. As a result, there is no unified
approach for controlling and manipulating scenes in 3D at different
levels of granularity. In this work, we propose 3DitScene, a novel and unified
scene editing framework leveraging language-guided disentangled Gaussian
Splatting that enables seamless editing from 2D to 3D, allowing precise control
over scene composition and individual objects. We first incorporate 3D
Gaussians that are refined through generative priors and optimization
techniques. Language features from CLIP then introduce semantics into 3D
geometry for object disentanglement. With the disentangled Gaussians, 3DitScene
allows for manipulation at both the global and individual levels,
revolutionizing creative expression and empowering control over scenes and
objects. Experimental results demonstrate the effectiveness and versatility of
3DitScene in scene image editing. Code and an online demo are available on our
project homepage: https://zqh0253.github.io/3DitScene/.
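
The abstract describes using language features to separate objects among the 3D Gaussians so that each object can be edited on its own. The sketch below is one way to picture that idea and is not the authors' implementation: it assumes each Gaussian carries a semantic feature vector and that a text query has already been embedded into the same space (for example by a CLIP text encoder, which is not called here). The function names, the 0.8 threshold, and the toy 2-D features are all hypothetical.

```python
import numpy as np

def select_gaussians(features, query, threshold=0.8):
    """Boolean mask of Gaussians whose feature is cosine-similar to the query.

    features: (N, D) per-Gaussian semantic features (assumed given).
    query:    (D,) embedding of the text prompt (assumed given).
    """
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    return f @ q >= threshold

def translate_selected(centers, mask, offset):
    """Return a copy of the Gaussian centers with the selected ones shifted."""
    moved = centers.copy()
    moved[mask] += np.asarray(offset, dtype=centers.dtype)
    return moved

# Toy scene: four Gaussians, two belonging to the queried object.
centers = np.array([[0.0, 0.0, 0.0],
                    [1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0],
                    [1.0, 1.0, 0.0]])
features = np.array([[1.0, 0.0],
                     [0.9, 0.1],
                     [0.0, 1.0],
                     [0.1, 0.9]])
query = np.array([1.0, 0.0])  # stand-in for a text embedding

mask = select_gaussians(features, query)
edited = translate_selected(centers, mask, offset=[0.0, 0.0, 2.0])
```

Only the Gaussians matched by the query move, and the rest of the scene is left untouched, which is the kind of object-level control the abstract refers to. A real system would also need to handle rotation, scale, and the regions revealed when an object moves.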