2D Gaussian Splatting with Semantic Alignment for Image Inpainting
September 2, 2025
Authors: Hongyu Li, Chaofeng Chen, Xiaoming Li, Guangming Lu
cs.AI
Abstract
Gaussian Splatting (GS), a recent technique for converting discrete points
into continuous spatial representations, has shown promising results in 3D
scene modeling and 2D image super-resolution. In this paper, we explore its
untapped potential for image inpainting, which demands both locally coherent
pixel synthesis and globally consistent semantic restoration. We propose the
first image inpainting framework based on 2D Gaussian Splatting, which encodes
incomplete images into a continuous field of 2D Gaussian splat coefficients and
reconstructs the final image via a differentiable rasterization process. The
continuous rendering paradigm of GS inherently promotes pixel-level coherence
in the inpainted results. To improve efficiency and scalability, we introduce a
patch-wise rasterization strategy that reduces memory overhead and accelerates
inference. For global semantic consistency, we incorporate features from a
pretrained DINO model. We observe that DINO's global features are naturally
robust to small missing regions and can be effectively adapted to guide
semantic alignment in large-mask scenarios, ensuring that the inpainted content
remains contextually consistent with the surrounding scene. Extensive
experiments on standard benchmarks demonstrate that our method achieves
competitive performance in both quantitative metrics and perceptual quality,
establishing a new direction for applying Gaussian Splatting to 2D image
processing.
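
The rasterization step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes each splat is a 2D Gaussian with a center, a covariance matrix, and a color, and that splats are blended additively without alpha compositing; the abstract does not specify the blending rule. The function names `rasterize` and `rasterize_patchwise`, the `patch` and `cutoff` parameters, and the random splat parameters are all hypothetical placeholders. In the proposed framework the coefficients would come from a network that encodes the incomplete image, which is not shown here.

```python
import numpy as np

def rasterize(mu, cov, color, y0, y1, x0, x1):
    """Additively splat N Gaussians onto the pixel window [y0, y1) x [x0, x1).

    mu: (N, 2) centers as (x, y); cov: (N, 2, 2) covariances; color: (N, C).
    Additive blending is an assumption made for this sketch.
    """
    ys, xs = np.mgrid[y0:y1, x0:x1].astype(np.float64) + 0.5      # pixel centers
    d = np.stack([xs, ys], axis=-1)[None] - mu[:, None, None, :]  # (N, h, w, 2)
    inv_cov = np.linalg.inv(cov)                                  # (N, 2, 2)
    maha = np.einsum("nhwi,nij,nhwj->nhw", d, inv_cov, d)         # squared Mahalanobis distance
    weights = np.exp(-0.5 * maha)                                 # (N, h, w)
    return np.einsum("nhw,nc->hwc", weights, color)               # (h, w, C)

def rasterize_patchwise(mu, cov, color, height, width, patch=16, cutoff=5.0):
    """Render patch by patch, skipping Gaussians that cannot reach a patch.

    The (N, h, w) weight tensor is only ever built for one patch at a time,
    which is where the memory saving in this sketch comes from.
    """
    out = np.zeros((height, width, color.shape[1]))
    # Conservative support radius: cutoff standard deviations along the widest axis.
    radius = cutoff * np.sqrt(np.linalg.eigvalsh(cov)[:, -1])
    for y0 in range(0, height, patch):
        for x0 in range(0, width, patch):
            y1, x1 = min(y0 + patch, height), min(x0 + patch, width)
            # Euclidean distance from each center to the patch rectangle.
            dx = np.maximum(np.maximum(x0 - mu[:, 0], mu[:, 0] - x1), 0.0)
            dy = np.maximum(np.maximum(y0 - mu[:, 1], mu[:, 1] - y1), 0.0)
            keep = np.hypot(dx, dy) <= radius
            if keep.any():
                out[y0:y1, x0:x1] = rasterize(
                    mu[keep], cov[keep], color[keep], y0, y1, x0, x1
                )
    return out

# Placeholder splat parameters (random, for illustration only).
rng = np.random.default_rng(0)
N, H, W = 64, 48, 64
mu = rng.uniform([0, 0], [W, H], size=(N, 2))
theta = rng.uniform(0, np.pi, N)
s = rng.uniform(1.0, 4.0, size=(N, 2))
R = np.stack([np.stack([np.cos(theta), -np.sin(theta)], -1),
              np.stack([np.sin(theta), np.cos(theta)], -1)], -2)   # (N, 2, 2) rotations
cov = R @ (s[..., None] ** 2 * np.eye(2)) @ R.transpose(0, 2, 1)   # R diag(s^2) R^T
color = rng.uniform(0.0, 1.0, size=(N, 3))

full = rasterize(mu, cov, color, 0, H, 0, W)
tiled = rasterize_patchwise(mu, cov, color, H, W)
```

With a cutoff of five standard deviations, each skipped Gaussian contributes at most exp(-12.5) to any pixel, so the patch-wise image matches the full render up to a small tolerance. Every operation is a smooth function of the centers, covariances, and colors, so the same computation written in an autodiff framework would be differentiable with respect to those parameters.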