Scene-Centric Unsupervised Panoptic Segmentation

Abstract

Unsupervised panoptic segmentation aims to partition an image into semantically meaningful regions and distinct object instances without training on manually annotated data. In contrast to prior work on unsupervised panoptic scene understanding, we eliminate the need for object-centric training data, which enables unsupervised understanding of complex scenes. To that end, we present the first unsupervised panoptic method that trains directly on scene-centric imagery. In particular, we propose an approach that obtains high-resolution panoptic pseudo labels on complex scene-centric data by combining visual representations, depth, and motion cues. Combining pseudo-label training with a panoptic self-training strategy yields a method that accurately predicts panoptic segmentation of complex scenes without requiring any human annotations. Our approach significantly improves panoptic quality (PQ), e.g., surpassing the recent state of the art in unsupervised panoptic segmentation on Cityscapes by 9.4 percentage points in PQ.
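For readers unfamiliar with the reported metric, the following is a minimal sketch of how panoptic quality (PQ) is commonly computed for a single class: predicted and ground-truth segments are matched when their IoU exceeds 0.5, and the summed IoU of the matches is divided by |TP| + 0.5|FP| + 0.5|FN|. The function name `panoptic_quality`, the helper `mask`, and the toy data are illustrative assumptions and not part of the described method; a full evaluation would also average over classes and handle ignore regions.

```python
import numpy as np

def panoptic_quality(pred_masks, gt_masks):
    """Single-class PQ from lists of boolean masks (one mask per segment).

    Assumes segments within each list do not overlap, so an IoU > 0.5
    match is unique.
    """
    iou_sum = 0.0
    matched_gt = set()
    for p in pred_masks:
        for j, g in enumerate(gt_masks):
            if j in matched_gt:
                continue
            union = np.logical_or(p, g).sum()
            iou = np.logical_and(p, g).sum() / union if union > 0 else 0.0
            if iou > 0.5:  # matching threshold of the PQ metric
                iou_sum += iou
                matched_gt.add(j)
                break
    tp = len(matched_gt)
    fp = len(pred_masks) - tp  # unmatched predicted segments
    fn = len(gt_masks) - tp    # unmatched ground-truth segments
    denom = tp + 0.5 * fp + 0.5 * fn
    return iou_sum / denom if denom > 0 else 0.0

def mask(start, stop, size=10):
    """Hypothetical helper: boolean mask over a flattened 10-pixel image."""
    m = np.zeros(size, dtype=bool)
    m[start:stop] = True
    return m

# Toy data: one good match (IoU 0.8), one spurious prediction, one missed segment.
gt = [mask(0, 4), mask(6, 10)]
pred = [mask(0, 5), mask(5, 6)]
pq = panoptic_quality(pred, gt)
print(round(pq, 3))
```

In this toy case there is one true positive with IoU 0.8, one false positive, and one false negative, so PQ = 0.8 / (1 + 0.5 + 0.5) = 0.4.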
