Hibou: A Family of Foundational Vision Transformers for Pathology
June 7, 2024
Authors: Dmitry Nechaev, Alexey Pchelnikov, Ekaterina Ivanova
cs.AI
Abstract
Pathology, the microscopic examination of diseased tissue, is critical for
diagnosing various medical conditions, particularly cancers. Traditional
methods are labor-intensive and prone to human error. Digital pathology, which
converts glass slides into high-resolution digital images for analysis by
computer algorithms, revolutionizes the field by enhancing diagnostic accuracy,
consistency, and efficiency through automated image analysis and large-scale
data processing. Foundational transformer pretraining is crucial for developing
robust, generalizable models as it enables learning from vast amounts of
unannotated data.
This paper introduces the Hibou family of foundational vision transformers
for pathology, leveraging the DINOv2 framework to pretrain two model variants,
Hibou-B and Hibou-L, on a proprietary dataset of over 1 million whole slide
images (WSIs) representing diverse tissue types and staining techniques. Our
pretrained models demonstrate superior performance on both patch-level and
slide-level benchmarks, surpassing existing state-of-the-art methods. Notably,
Hibou-L achieves the highest average accuracy across multiple benchmark
datasets. To support further research and application in the field, we have
open-sourced the Hibou-B model, which can be accessed at
https://github.com/HistAI/hibou.
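The abstract distinguishes patch-level from slide-level evaluation. The sketch below illustrates that distinction only: it tiles an image region into fixed-size patches, computes one feature vector per patch, and averages them into a single region-level vector. Everything in it is an assumption made for illustration and is not taken from the paper: the 224-pixel patch size, the mean-pooling aggregation, and the `embed_patch` function, which is a hypothetical stand-in for a pretrained encoder such as Hibou-B and simply returns per-channel mean intensity.

```python
import numpy as np


def tile_image(image, patch_size):
    """Split an H x W x C array into non-overlapping square patches.

    Ragged edges that do not fill a whole patch are dropped.
    """
    h, w, c = image.shape
    rows, cols = h // patch_size, w // patch_size
    cropped = image[: rows * patch_size, : cols * patch_size]
    # (rows, patch, cols, patch, C) -> (rows, cols, patch, patch, C)
    patches = cropped.reshape(rows, patch_size, cols, patch_size, c).swapaxes(1, 2)
    return patches.reshape(rows * cols, patch_size, patch_size, c)


def embed_patch(patch):
    """Hypothetical stand-in for a pretrained encoder (assumption).

    Returns the per-channel mean intensity instead of a learned embedding.
    """
    return patch.reshape(-1, patch.shape[-1]).mean(axis=0)


def slide_embedding(image, patch_size=224):
    """Return (patch-level features, mean-pooled region-level feature)."""
    patches = tile_image(image, patch_size)
    patch_features = np.stack([embed_patch(p) for p in patches])
    # Mean pooling is an illustrative aggregation choice, not the paper's method.
    return patch_features, patch_features.mean(axis=0)


# Synthetic RGB region standing in for part of a whole slide image.
rng = np.random.default_rng(0)
region = rng.integers(0, 256, size=(500, 700, 3), dtype=np.uint8)
patch_features, slide_feature = slide_embedding(region)
print(patch_features.shape, slide_feature.shape)
```

In practice the stand-in encoder would be replaced by a real pretrained model, and slide-level tasks typically use a learned aggregator over patch features rather than a plain mean.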