AtlasPatch: An Efficient and Scalable Tool for Whole Slide Image Preprocessing in Computational Pathology
February 3, 2026
Authors: Ahmed Alagha, Christopher Leclerc, Yousef Kotp, Omar Metwally, Calvin Moras, Peter Rentopoulos, Ghodsiyeh Rostami, Bich Ngoc Nguyen, Jumanah Baig, Abdelhakim Khellaf, Vincent Quoc-Huy Trinh, Rabeb Mizouni, Hadi Otrok, Jamal Bentahar, Mahdi S. Hosseini
cs.AI
Abstract
Whole-slide image (WSI) preprocessing, typically comprising tissue detection followed by patch extraction, is foundational to AI-driven computational pathology workflows. This step remains a major computational bottleneck because existing tools either rely on inaccurate heuristic thresholding for tissue detection or adopt AI-based approaches that are trained on limited-diversity data and operate at the patch level, incurring substantial computational complexity. We present AtlasPatch, an efficient and scalable slide preprocessing framework for accurate tissue detection and high-throughput patch extraction with minimal computational overhead. AtlasPatch's tissue detection module is trained on a heterogeneous, semi-manually annotated dataset of ~30,000 WSI thumbnails, using efficient fine-tuning of the Segment-Anything model. The tool extrapolates tissue masks from thumbnails to full-resolution slides to extract patch coordinates at user-specified magnifications, with options to stream patches directly into common image encoders for embedding or to store patch images, all efficiently parallelized across CPUs and GPUs. We assess AtlasPatch on segmentation precision, computational complexity, and downstream multiple-instance learning; it matches state-of-the-art methods while operating at a fraction of their computational cost. AtlasPatch is open-source and available at https://github.com/AtlasAnalyticsLab/AtlasPatch.
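To make the thumbnail-to-slide step concrete, the following is a minimal sketch of how a binary tissue mask computed on a thumbnail could be mapped to full-resolution patch coordinates at a requested magnification. It is not AtlasPatch's actual API: the function names `level0_patch_size` and `patch_coords_from_thumbnail`, the `min_tissue` threshold, and the non-overlapping grid are all assumptions made for illustration.

```python
import numpy as np


def level0_patch_size(patch_size, base_mag, target_mag):
    """Side length, in level-0 pixels, of a patch that is `patch_size`
    pixels wide at `target_mag` on a slide scanned at `base_mag`.
    (Hypothetical helper, not part of AtlasPatch.)"""
    return int(round(patch_size * base_mag / target_mag))


def patch_coords_from_thumbnail(mask, downsample, patch_size, min_tissue=0.5):
    """Return level-0 (x, y) top-left corners of grid patches whose
    footprint on the thumbnail mask is at least `min_tissue` tissue.

    mask       : 2-D boolean array, True where the thumbnail shows tissue
    downsample : level-0 pixels per thumbnail pixel
    patch_size : patch side length in level-0 pixels
    """
    mask = np.asarray(mask, dtype=bool)
    thumb_h, thumb_w = mask.shape
    slide_w = int(thumb_w * downsample)
    slide_h = int(thumb_h * downsample)

    coords = []
    # Non-overlapping grid over the full-resolution slide (an assumption).
    for y in range(0, slide_h - patch_size + 1, patch_size):
        for x in range(0, slide_w - patch_size + 1, patch_size):
            # Project the patch footprint back onto the thumbnail.
            x0 = int(x / downsample)
            y0 = int(y / downsample)
            x1 = max(x0 + 1, int(np.ceil((x + patch_size) / downsample)))
            y1 = max(y0 + 1, int(np.ceil((y + patch_size) / downsample)))
            # Keep the patch if enough of its footprint is tissue.
            if mask[y0:y1, x0:x1].mean() >= min_tissue:
                coords.append((x, y))
    return coords


if __name__ == "__main__":
    # Toy 4x4 thumbnail whose top half is tissue; 64x downsample.
    toy_mask = np.zeros((4, 4), dtype=bool)
    toy_mask[:2, :] = True
    size = level0_patch_size(64, base_mag=40, target_mag=20)
    print(patch_coords_from_thumbnail(toy_mask, downsample=64, patch_size=size))
```

Because only the small thumbnail mask is inspected, the full-resolution pixels need to be read only for the patches that are kept, which is the source of the computational saving described in the abstract.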