Unsupervised Universal Image Segmentation

Abstract

Several unsupervised image segmentation approaches have been proposed that eliminate the need for dense, manually annotated segmentation masks. However, current models handle either semantic segmentation (e.g., STEGO) or class-agnostic instance segmentation (e.g., CutLER) separately, but not both together (i.e., panoptic segmentation). We propose an Unsupervised Universal Segmentation model (U2Seg) that performs various image segmentation tasks (instance, semantic, and panoptic) using a novel unified framework. U2Seg generates pseudo semantic labels for these segmentation tasks by leveraging self-supervised models followed by clustering; each cluster represents a different semantic and/or instance membership of pixels. We then self-train the model on these pseudo semantic labels, yielding substantial performance gains over specialized methods tailored to each task: a +2.6 AP^{box} boost over CutLER in unsupervised instance segmentation on COCO and a +7.0 PixelAcc increase over STEGO in unsupervised semantic segmentation on COCOStuff. Moreover, our method sets a new baseline for unsupervised panoptic segmentation, which has not been previously explored. U2Seg is also a strong pretrained model for few-shot segmentation, surpassing CutLER by +5.0 AP^{mask} when trained in a low-data regime, e.g., with only 1% of COCO labels. We hope our simple yet effective method can inspire more research on unsupervised universal image segmentation.
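The clustering step mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each class-agnostic mask has already been summarized as a feature vector by some self-supervised model, and it uses plain k-means with a deterministic farthest-first initialization as a stand-in clustering method. The names `mask_features`, `init_centroids`, and `kmeans` are hypothetical, and the toy vectors are made up for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centroids(features, k):
    """Farthest-first initialization (an assumption, chosen for determinism)."""
    centroids = [list(features[0])]
    while len(centroids) < k:
        # Pick the point whose nearest existing centroid is farthest away.
        farthest = max(features, key=lambda f: min(sq_dist(f, c) for c in centroids))
        centroids.append(list(farthest))
    return centroids


def kmeans(features, k, iters=20):
    """Plain k-means; returns one cluster ID per feature vector."""
    centroids = init_centroids(features, k)
    labels = None
    for _ in range(iters):
        # Assignment step: each vector goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(f, centroids[c]))
            for f in features
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [f for f, l in zip(features, labels) if l == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy stand-ins for per-mask embeddings from a self-supervised model (made up).
mask_features = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 4.9]]

# Each cluster ID then serves as a pseudo semantic label for its mask.
pseudo_labels = kmeans(mask_features, k=2)
print(pseudo_labels)
```

In this sketch the cluster IDs carry no human-readable meaning; they only group masks with similar features, which is what allows them to act as pseudo labels for self-training.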
