Robust Neural Rendering in the Wild with Asymmetric Dual 3D Gaussian Splatting

Abstract

3D reconstruction from in-the-wild images remains challenging because of inconsistent lighting conditions and transient distractors. Existing methods typically rely on heuristic strategies to handle low-quality training data; these strategies often fail to produce stable, consistent reconstructions and frequently introduce visual artifacts. In this work, we propose Asymmetric Dual 3DGS, a novel framework that leverages the stochastic nature of these artifacts: they tend to vary across training runs because of minor randomness. Specifically, our method trains two 3D Gaussian Splatting (3DGS) models in parallel and enforces a consistency constraint that encourages convergence on reliable scene geometry while suppressing inconsistent artifacts. To prevent the two models from collapsing into similar failure modes through confirmation bias, we introduce a divergent masking strategy that applies two complementary masks, a multi-cue adaptive mask and a self-supervised soft mask. The resulting asymmetric training process reduces the error modes shared by the two models. To improve training efficiency, we further introduce a lightweight variant called Dynamic EMA Proxy, which replaces one of the two models with a dynamically updated Exponential Moving Average (EMA) proxy and employs an alternating masking strategy to preserve divergence. Extensive experiments on challenging real-world datasets demonstrate that our method consistently outperforms existing approaches while achieving high efficiency. Code and trained models will be released.
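
As a rough illustration of the ingredients named in the abstract, the sketch below shows a generic exponential moving average update, a mask-weighted consistency term between two renders, and a per-step alternation between two masks. The abstract does not specify the loss form, the decay value, or the alternation schedule, so the L1 difference, the `decay` parameter, the even/odd switching rule, and all function names here are assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def ema_update(proxy: Sequence[float], model: Sequence[float], decay: float) -> List[float]:
    """Standard EMA step: move each proxy parameter toward the model by (1 - decay)."""
    return [decay * p + (1.0 - decay) * m for p, m in zip(proxy, model)]


def masked_consistency_loss(
    render_a: Sequence[float],
    render_b: Sequence[float],
    mask: Sequence[float],
) -> float:
    """Mask-weighted mean absolute difference between two renders.

    Assumption: an L1 difference is used; mask weights near 0 mark pixels
    treated as unreliable (e.g. transient distractors).
    """
    total_weight = sum(mask)
    if total_weight == 0:
        return 0.0
    weighted = sum(w * abs(a - b) for a, b, w in zip(render_a, render_b, mask))
    return weighted / total_weight


def pick_mask(step: int, first_mask: T, second_mask: T) -> T:
    """Hypothetical alternation rule: first mask on even steps, second on odd steps."""
    return first_mask if step % 2 == 0 else second_mask


# Toy usage with flat lists standing in for parameters and pixel values.
proxy = ema_update([0.0, 1.0], [1.0, 1.0], decay=0.9)
loss = masked_consistency_loss([1.0, 2.0, 3.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0])
```

In this toy setting the third pixel has zero mask weight, so its large discrepancy does not contribute to the consistency term; only pixels that the mask considers reliable pull the two renders together.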
