Fast Segment Anything

Abstract

The recently proposed Segment Anything Model (SAM) has had a significant impact on many computer vision tasks. It is becoming a foundational step for many high-level tasks, such as image segmentation, image captioning, and image editing. However, its high computational cost prevents wider application in industrial scenarios. The cost comes mainly from running the Transformer architecture on high-resolution inputs. In this paper, we propose a faster alternative method for this fundamental task with comparable performance. By reformulating the task as segment generation followed by prompting, we find that a regular CNN detector with an instance segmentation branch can also accomplish this task well. Specifically, we convert the task to the well-studied instance segmentation task and directly train an existing instance segmentation method using only 1/50 of the SA-1B dataset published by the SAM authors. With our method, we achieve performance comparable to SAM at 50 times higher run-time speed. We provide experimental results to demonstrate its effectiveness. The code and demos will be released at https://github.com/CASIA-IVA-Lab/FastSAM.
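The two-stage reformulation described above (generate all segments first, then select by prompt) can be illustrated with a minimal sketch. This is not the authors' implementation: the hard-coded masks stand in for the output of an instance segmentation network, and the names `select_mask_by_point` and `mask_area`, as well as the rule of preferring the smallest mask that contains the prompted point, are assumptions made for illustration only.

```python
from typing import List, Optional, Tuple

# A binary mask: mask[row][col] is 1 inside the segment, 0 outside.
Mask = List[List[int]]


def mask_area(mask: Mask) -> int:
    """Count the pixels that belong to the segment."""
    return sum(sum(row) for row in mask)


def select_mask_by_point(masks: List[Mask], point: Tuple[int, int]) -> Optional[int]:
    """Return the index of the mask selected by a point prompt.

    Assumption (hypothetical rule): among all masks that contain the point,
    pick the one with the smallest area. Returns None if no mask contains it.
    """
    row, col = point
    candidates = [i for i, m in enumerate(masks) if m[row][col] == 1]
    if not candidates:
        return None
    return min(candidates, key=lambda i: mask_area(masks[i]))


# Stage 1 stand-in: masks that a segmentation network would normally produce.
large = [
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
small = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
masks = [large, small]

# Stage 2: prompting is a cheap lookup over the precomputed masks.
print(select_mask_by_point(masks, (1, 1)))  # point lies in both masks
print(select_mask_by_point(masks, (0, 0)))  # point lies only in the large mask
print(select_mask_by_point(masks, (3, 3)))  # point lies in no mask
```

Because all segments are computed once up front, each additional prompt only requires a lookup over existing masks rather than another pass through the network.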
