ObjectGS: Object-aware Scene Reconstruction and Scene Understanding via Gaussian Splatting

Abstract

3D Gaussian Splatting is renowned for its high-fidelity reconstructions and real-time novel view synthesis, yet its lack of semantic understanding limits object-level perception. In this work, we propose ObjectGS, an object-aware framework that unifies 3D scene reconstruction with semantic understanding. Instead of treating the scene as a unified whole, ObjectGS models individual objects as local anchors that generate neural Gaussians and share object IDs, enabling precise object-level reconstruction. During training, we dynamically grow or prune these anchors and optimize their features, while a one-hot ID encoding with a classification loss enforces clear semantic constraints. We show through extensive experiments that ObjectGS not only outperforms state-of-the-art methods on open-vocabulary and panoptic segmentation tasks, but also integrates seamlessly with applications such as mesh extraction and scene editing. Project page: https://ruijiezhu94.github.io/ObjectGS_page
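
The one-hot ID encoding and classification loss mentioned in the abstract can be illustrated with a small sketch. The abstract does not say how the encoding is rendered or which loss is used, so everything below is an assumption made for illustration: per-Gaussian one-hot vectors are alpha-composited front to back along a single ray, and the result is scored with a cross-entropy loss against a ground-truth object ID. The function names (one_hot, composite_ids, classification_loss) are hypothetical and are not taken from the ObjectGS code.

```python
import math

def one_hot(object_id, num_objects):
    """Encode an integer object ID as a one-hot list of length num_objects."""
    return [1.0 if k == object_id else 0.0 for k in range(num_objects)]

def composite_ids(object_ids, alphas, num_objects):
    """Alpha-composite one-hot IDs of depth-sorted Gaussians along one ray.

    Assumption: IDs are blended like colors in standard front-to-back
    compositing. Returns the accumulated weight per object ID at the pixel.
    """
    pixel = [0.0] * num_objects
    transmittance = 1.0  # fraction of light not yet absorbed
    for object_id, alpha in zip(object_ids, alphas):
        weight = alpha * transmittance
        encoding = one_hot(object_id, num_objects)
        for k in range(num_objects):
            pixel[k] += weight * encoding[k]
        transmittance *= 1.0 - alpha
    return pixel

def classification_loss(pixel, target_id, eps=1e-8):
    """Cross-entropy between the normalized pixel ID weights and the target ID."""
    total = sum(pixel) + eps
    return -math.log(pixel[target_id] / total + eps)

# Three Gaussians on one ray: two belong to object 1, one to object 2.
pixel = composite_ids([1, 1, 2], [0.5, 0.5, 1.0], num_objects=3)
loss_correct = classification_loss(pixel, target_id=1)
loss_wrong = classification_loss(pixel, target_id=2)
```

Under these assumptions, the pixel is dominated by object 1, so the loss is lower when the ground-truth label is 1 than when it is 2. Because the IDs are one-hot, the composited vector can be read directly as per-object evidence, which is one plausible reason a classification loss fits this encoding.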
