Unsupervised Universal Image Segmentation

Abstract

Several unsupervised image segmentation approaches have been proposed that eliminate the need for dense, manually annotated segmentation masks. Current models handle either semantic segmentation (e.g., STEGO) or class-agnostic instance segmentation (e.g., CutLER) separately, but not both together (i.e., panoptic segmentation). We propose an Unsupervised Universal Segmentation model (U2Seg) that performs various image segmentation tasks (instance, semantic, and panoptic) using a novel unified framework. U2Seg generates pseudo semantic labels for these segmentation tasks by leveraging self-supervised models followed by clustering; each cluster represents a distinct semantic and/or instance membership of pixels. We then self-train the model on these pseudo semantic labels, yielding substantial performance gains over specialized methods tailored to each task: a +2.6 AP^{box} boost over CutLER in unsupervised instance segmentation on COCO and a +7.0 PixelAcc increase over STEGO in unsupervised semantic segmentation on COCOStuff. Moreover, our method establishes a new baseline for unsupervised panoptic segmentation, which has not been previously explored. U2Seg is also a strong pretrained model for few-shot segmentation, surpassing CutLER by +5.0 AP^{mask} when trained in a low-data regime, e.g., with only 1% of COCO labels. We hope our simple yet effective method can inspire more research on unsupervised universal image segmentation.
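The pseudo-labelling idea described in the abstract (features from a self-supervised model, followed by clustering) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function name `cluster_pseudo_labels` is hypothetical, random vectors stand in for real per-mask features, and plain k-means with farthest-point initialisation is one possible clustering choice, not necessarily the one used by U2Seg.

```python
import numpy as np


def cluster_pseudo_labels(features, num_clusters, num_iters=10):
    """Cluster feature vectors; each cluster index acts as a pseudo semantic label.

    Hypothetical helper for illustration only (not the U2Seg implementation).
    features: array of shape (num_masks, feature_dim).
    """
    features = np.asarray(features, dtype=float)

    # Farthest-point initialisation: start from the first vector, then
    # repeatedly add the vector farthest from all centroids chosen so far.
    chosen = [features[0]]
    for _ in range(1, num_clusters):
        diffs = features[:, None, :] - np.stack(chosen)[None, :, :]
        nearest = (diffs ** 2).sum(axis=-1).min(axis=1)
        chosen.append(features[nearest.argmax()])
    centroids = np.stack(chosen).copy()

    def assign(cents):
        # Index of the nearest centroid (squared Euclidean distance).
        d = ((features[:, None, :] - cents[None, :, :]) ** 2).sum(axis=-1)
        return d.argmin(axis=1)

    for _ in range(num_iters):
        labels = assign(centroids)
        for k in range(num_clusters):
            members = features[labels == k]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[k] = members.mean(axis=0)

    return assign(centroids), centroids


# Synthetic stand-in for mask features: two well-separated groups of 5 vectors.
rng = np.random.default_rng(0)
features = np.concatenate([
    rng.normal(0.0, 0.1, size=(5, 8)),
    rng.normal(5.0, 0.1, size=(5, 8)),
])
labels, centroids = cluster_pseudo_labels(features, num_clusters=2)
print(labels)
```

In a self-training setup of the kind the abstract describes, cluster indices like these would serve as the class targets for the segmentation model in place of human annotations.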
