Surgical SAM 2: Real-time Segment Anything in Surgical Video by Efficient Frame Pruning

August 15, 2024
Authors: Haofeng Liu, Erli Zhang, Junde Wu, Mingxuan Hong, Yueming Jin
cs.AI

Abstract

Surgical video segmentation is a critical task in computer-assisted surgery and is vital for enhancing surgical quality and patient outcomes. Recently, the Segment Anything Model 2 (SAM2) framework has shown strong advances in image and video segmentation. However, SAM2 struggles with efficiency because of the high computational demands of processing high-resolution images and the complex, long-range temporal dynamics of surgical videos. To address these challenges, we introduce Surgical SAM 2 (SurgSAM-2), an advanced model that combines SAM2 with an Efficient Frame Pruning (EFP) mechanism to enable real-time surgical video segmentation. The EFP mechanism dynamically manages the memory bank by retaining only the most informative frames, reducing memory usage and computational cost while maintaining high segmentation accuracy. Our extensive experiments demonstrate that SurgSAM-2 significantly improves both efficiency and segmentation accuracy compared with the vanilla SAM2. Remarkably, SurgSAM-2 achieves 3× the FPS of SAM2, while also delivering state-of-the-art performance after fine-tuning with lower-resolution data. These advancements establish SurgSAM-2 as a leading model for surgical video analysis, making real-time surgical video segmentation feasible in resource-constrained environments.
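
The abstract describes EFP only at a high level: a memory bank that keeps the most informative frames and drops the rest. The sketch below illustrates that general idea in plain Python. It is not the authors' implementation. The scoring rule (a stored frame counts as more informative the less it resembles the newest frame, measured by cosine similarity), the function names `cosine_similarity` and `prune_memory_bank`, and the `capacity` parameter are all assumptions introduced here for illustration.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical representation: (frame index, feature vector for that frame).
Frame = Tuple[int, Sequence[float]]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def prune_memory_bank(bank: List[Frame], new_frame: Frame, capacity: int) -> List[Frame]:
    """Return a new bank containing new_frame and at most `capacity` frames.

    Assumed scoring rule (not stated in the abstract): stored frames that are
    least similar to the newest frame are treated as the most informative.
    """
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    if len(bank) + 1 <= capacity:
        return bank + [new_frame]
    # Rank stored frames from least to most similar to the newest frame.
    ranked = sorted(bank, key=lambda f: cosine_similarity(f[1], new_frame[1]))
    kept = ranked[: capacity - 1]
    kept.sort(key=lambda f: f[0])  # restore temporal order
    return kept + [new_frame]


# Usage: frame 0 is identical to the new frame, so it is the one pruned.
bank = [(0, [1.0, 0.0]), (1, [0.0, 1.0]), (2, [0.9, 0.1])]
pruned = prune_memory_bank(bank, (3, [1.0, 0.0]), capacity=3)
print([index for index, _ in pruned])  # [1, 2, 3]
```

Because the bank size is capped at `capacity`, the cost of attending over stored frames stays bounded regardless of video length, which is the kind of saving the abstract attributes to EFP.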

