GSFixer: Improving 3D Gaussian Splatting with Reference-Guided Video Diffusion Priors

Abstract

Reconstructing 3D scenes with 3D Gaussian Splatting (3DGS) from sparse views is an ill-posed problem: the input carries too little information, which often results in noticeable artifacts. Recent approaches leverage generative priors to fill in under-constrained regions, but they struggle to generate content that remains consistent with the input observations. To address this challenge, we propose GSFixer, a novel framework designed to improve the quality of 3DGS representations reconstructed from sparse inputs. The core of our approach is a reference-guided video restoration model, built upon a DiT-based video diffusion model and trained on paired artifact-laden 3DGS renders and clean frames with additional reference-based conditions. Treating the input sparse views as references, our model integrates both 2D semantic features and 3D geometric features of the reference views extracted from a visual geometry foundation model, enhancing semantic coherence and 3D consistency when fixing novel views that contain artifacts. Furthermore, given the lack of suitable benchmarks for evaluating 3DGS artifact restoration, we present DL3DV-Res, which contains artifact frames rendered from low-quality 3DGS. Extensive experiments demonstrate that GSFixer outperforms current state-of-the-art methods in 3DGS artifact restoration and sparse-view 3D reconstruction. Project page: https://github.com/GVCLab/GSFixer.