AtlasPatch: An Efficient and Scalable Tool for Whole Slide Image Preprocessing in Computational Pathology

Abstract

Whole-slide image (WSI) preprocessing, typically comprising tissue detection followed by patch extraction, is foundational to AI-driven computational pathology workflows. It remains a major computational bottleneck because existing tools either rely on inaccurate heuristic thresholding for tissue detection or adopt AI-based approaches that are trained on limited-diversity data and operate at the patch level, incurring substantial computational complexity. We present AtlasPatch, an efficient and scalable slide preprocessing framework for accurate tissue detection and high-throughput patch extraction with minimal computational overhead. AtlasPatch's tissue detection module is trained on a heterogeneous, semi-manually annotated dataset of ~30,000 WSI thumbnails using efficient fine-tuning of the Segment-Anything model. The tool extrapolates tissue masks from thumbnails to full-resolution slides to extract patch coordinates at user-specified magnifications, with options to stream patches directly into common image encoders for embedding or to store patch images, all efficiently parallelized across CPUs and GPUs. We assess AtlasPatch on segmentation precision, computational complexity, and downstream multiple-instance learning; it matches state-of-the-art performance while operating at a fraction of the computational cost. AtlasPatch is open-source and available at https://github.com/AtlasAnalyticsLab/AtlasPatch.
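The step of extrapolating a thumbnail-level tissue mask to full-resolution patch coordinates can be illustrated with a minimal sketch. This is not AtlasPatch's actual API: the function name `patch_coords_from_thumbnail_mask`, its parameters, and the `min_tissue` threshold are all hypothetical, and the sketch assumes a non-overlapping grid and a known level-0 objective magnification.

```python
import numpy as np


def patch_coords_from_thumbnail_mask(mask, slide_size, patch_size,
                                     base_mag, target_mag, min_tissue=0.5):
    """Return level-0 (x, y) top-left corners of grid patches covering tissue.

    Hypothetical helper for illustration only (not part of AtlasPatch).
    mask:        2D bool array (thumbnail rows x cols), True = tissue.
    slide_size:  (width, height) of the slide at level 0, in pixels.
    patch_size:  patch side length in pixels at the target magnification.
    base_mag:    objective magnification at level 0 (e.g. 40).
    target_mag:  requested magnification (e.g. 20).
    min_tissue:  assumed minimum tissue fraction for keeping a patch.
    """
    slide_w, slide_h = slide_size
    mask_h, mask_w = mask.shape
    # A patch of `patch_size` px at target_mag spans this many level-0 px.
    step = int(round(patch_size * base_mag / target_mag))
    # Scale factors from level-0 pixels to thumbnail pixels.
    sx, sy = mask_w / slide_w, mask_h / slide_h
    coords = []
    for y in range(0, slide_h - step + 1, step):
        for x in range(0, slide_w - step + 1, step):
            # Footprint of this patch in thumbnail space (at least 1 px).
            x0, y0 = int(x * sx), int(y * sy)
            x1 = max(x0 + 1, int(np.ceil((x + step) * sx)))
            y1 = max(y0 + 1, int(np.ceil((y + step) * sy)))
            # Keep the patch if enough of its footprint is tissue.
            if mask[y0:y1, x0:x1].mean() >= min_tissue:
                coords.append((x, y))
    return coords


# Toy example: a 4x4 thumbnail whose top-left quadrant is tissue,
# standing in for a 4096x4096 slide scanned at 40x.
toy_mask = np.zeros((4, 4), dtype=bool)
toy_mask[:2, :2] = True
print(patch_coords_from_thumbnail_mask(toy_mask, (4096, 4096), 512, 40, 20))
# prints [(0, 0), (1024, 0), (0, 1024), (1024, 1024)]
```

Because the mask is only evaluated at thumbnail resolution, the cost of selecting patches scales with the number of grid cells rather than with the number of full-resolution pixels, which is the kind of saving the abstract attributes to working from thumbnails.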
