TinySAM: Pushing the Envelope for Efficient Segment Anything Model

Abstract

The Segment Anything Model (SAM) has recently shown powerful segmentation capability and has drawn great attention in computer vision. Many follow-up works have built applications on the pretrained SAM and achieved impressive performance on downstream vision tasks. However, SAM has a heavy architecture and requires massive computational capacity, which hinders its deployment on computation-constrained edge devices. To address this, we propose a framework for obtaining a tiny Segment Anything Model (TinySAM) that maintains strong zero-shot performance. We first propose a full-stage knowledge distillation method with an online hard prompt sampling strategy to distill a lightweight student model. We also adapt post-training quantization to the promptable segmentation task, further reducing the computational cost. Moreover, we propose a hierarchical segmenting-everything strategy that accelerates everything-mode inference by 2× with almost no performance degradation. With all of these methods, TinySAM achieves an orders-of-magnitude reduction in computation and pushes the envelope for the efficient segment anything task. Extensive experiments on various zero-shot transfer tasks demonstrate that TinySAM performs significantly better than counterpart methods. Pre-trained models and code will be available at https://github.com/xinghaochen/TinySAM and https://gitee.com/mindspore/models/tree/master/research/cv/TinySAM.
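The abstract does not say which quantization scheme TinySAM uses. As a generic illustration of what post-training quantization does to an already-trained model, here is a minimal sketch that assumes symmetric, per-tensor, 8-bit uniform quantization. The function names `quantize_symmetric` and `dequantize` and the sample weights are hypothetical and are not taken from the TinySAM code.

```python
from typing import List, Tuple


def quantize_symmetric(values: List[float], num_bits: int = 8) -> Tuple[List[int], float]:
    """Map floats to signed integers using one shared scale (assumed scheme)."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max((abs(v) for v in values), default=0.0)
    # The scale is chosen so that the largest magnitude maps to qmax.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to the representable range.
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the integer levels."""
    return [q * scale for q in quantized]


# Hypothetical weights standing in for one trained tensor.
weights = [0.02, -0.5, 0.31, 1.27, -1.0]
q, scale = quantize_symmetric(weights)
restored = dequantize(q, scale)
# Rounding error is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

No retraining takes place in this sketch: the scale is derived from the trained values alone, which is what distinguishes post-training quantization from quantization-aware training. Real schemes typically also calibrate activation ranges on sample inputs, which this sketch omits.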

