ReMaX: Relaxing for Better Training on Efficient Panoptic Segmentation

Abstract

This paper presents a new mechanism to facilitate the training of mask transformers for efficient panoptic segmentation, democratizing its deployment. We observe that, due to its high complexity, the training objective of panoptic segmentation inevitably leads to much higher false positive penalization. Such an unbalanced loss makes it difficult to train end-to-end mask-transformer-based architectures, especially efficient models. We present ReMaX, which adds relaxation to mask predictions and class predictions during training for panoptic segmentation. We demonstrate that these simple training-time relaxation techniques consistently improve our model by a clear margin without any extra computational cost at inference. Combined with efficient backbones such as MobileNetV3-Small, our method achieves new state-of-the-art results for efficient panoptic segmentation on COCO, ADE20K and Cityscapes. Code and pre-trained checkpoints will be available at https://github.com/google-research/deeplab2.
