Hibou: A Family of Foundational Vision Transformers for Pathology

Abstract

Pathology, the microscopic examination of diseased tissue, is critical for diagnosing various medical conditions, particularly cancers. Traditional methods are labor-intensive and prone to human error. Digital pathology, which converts glass slides into high-resolution digital images for analysis by computer algorithms, is transforming the field by improving diagnostic accuracy, consistency, and efficiency through automated image analysis and large-scale data processing. Foundational transformer pretraining is crucial for developing robust, generalizable models because it enables learning from vast amounts of unannotated data. This paper introduces the Hibou family of foundational vision transformers for pathology. We use the DINOv2 framework to pretrain two model variants, Hibou-B and Hibou-L, on a proprietary dataset of over 1 million whole slide images (WSIs) representing diverse tissue types and staining techniques. Our pretrained models demonstrate superior performance on both patch-level and slide-level benchmarks, surpassing existing state-of-the-art methods. Notably, Hibou-L achieves the highest average accuracy across multiple benchmark datasets. To support further research and application in the field, we have open-sourced the Hibou-B model, which is available at https://github.com/HistAI/hibou.
