Fast Segment Anything

Abstract

The recently proposed Segment Anything Model (SAM) has had a significant impact on many computer vision tasks. It is becoming a foundational step for many high-level tasks, such as image segmentation, image captioning, and image editing. However, its huge computational cost prevents wider application in industrial scenarios. The computation mainly comes from the Transformer architecture operating on high-resolution inputs. In this paper, we propose a faster alternative method for this fundamental task with comparable performance. By reformulating the task as segment generation followed by prompting, we find that a regular CNN detector with an instance segmentation branch can also accomplish this task well. Specifically, we convert this task to the well-studied instance segmentation task and directly train an existing instance segmentation method using only 1/50 of the SA-1B dataset published by the SAM authors. With our method, we achieve performance comparable to SAM at 50 times higher run-time speed. We provide sufficient experimental results to demonstrate its effectiveness. The code and demos will be released at https://github.com/CASIA-IVA-Lab/FastSAM.
