ReMaX: Relaxing for Better Training on Efficient Panoptic Segmentation

Abstract

This paper presents a new mechanism to facilitate the training of mask transformers for efficient panoptic segmentation, democratizing its deployment. We observe that, due to its high complexity, the training objective of panoptic segmentation inevitably leads to much higher penalization of false positives. Such an unbalanced loss makes the training of end-to-end mask-transformer-based architectures difficult, especially for efficient models. In this paper, we present ReMaX, which adds relaxation to mask predictions and class predictions during training for panoptic segmentation. We demonstrate that, with these simple relaxation techniques applied during training, our model can be consistently improved by a clear margin without any extra computational cost at inference. Combined with efficient backbones such as MobileNetV3-Small, our method achieves new state-of-the-art results for efficient panoptic segmentation on COCO, ADE20K and Cityscapes. Code and pre-trained checkpoints will be available at https://github.com/google-research/deeplab2.
