Hibou: A Family of Foundational Vision Transformers for Pathology
June 7, 2024
Authors: Dmitry Nechaev, Alexey Pchelnikov, Ekaterina Ivanova
cs.AI
Abstract
Pathology, the microscopic examination of diseased tissue, is critical for
diagnosing various medical conditions, particularly cancers. Traditional
methods are labor-intensive and prone to human error. Digital pathology, which
converts glass slides into high-resolution digital images for analysis by
computer algorithms, revolutionizes the field by enhancing diagnostic accuracy,
consistency, and efficiency through automated image analysis and large-scale
data processing. Foundational transformer pretraining is crucial for developing
robust, generalizable models as it enables learning from vast amounts of
unannotated data.
This paper introduces the Hibou family of foundational vision transformers
for pathology, leveraging the DINOv2 framework to pretrain two model variants,
Hibou-B and Hibou-L, on a proprietary dataset of over 1 million whole slide
images (WSIs) representing diverse tissue types and staining techniques. Our
pretrained models demonstrate superior performance on both patch-level and
slide-level benchmarks, surpassing existing state-of-the-art methods. Notably,
Hibou-L achieves the highest average accuracy across multiple benchmark
datasets. To support further research and application in the field, we have
open-sourced the Hibou-B model, which can be accessed at
https://github.com/HistAI/hibou