ReMaX: Relaxing for Better Training on Efficient Panoptic Segmentation

Abstract

This paper presents a new mechanism that facilitates the training of mask transformers for efficient panoptic segmentation, democratizing its deployment. We observe that, because of its high complexity, the training objective of panoptic segmentation inevitably leads to much higher false positive penalization. Such an unbalanced loss makes end-to-end mask-transformer-based architectures difficult to train, especially efficient models. We present ReMaX, which adds relaxation to mask predictions and class predictions during training for panoptic segmentation. We demonstrate that these simple training-time relaxation techniques consistently improve our model by a clear margin without any extra computational cost at inference. Combined with efficient backbones such as MobileNetV3-Small, our method achieves new state-of-the-art results for efficient panoptic segmentation on COCO, ADE20K, and Cityscapes. Code and pre-trained checkpoints will be available at https://github.com/google-research/deeplab2.
