Hibou: A Family of Foundational Vision Transformers for Pathology

Abstract

Pathology, the microscopic examination of diseased tissue, is critical for diagnosing various medical conditions, particularly cancers. Traditional methods are labor-intensive and prone to human error. Digital pathology, which converts glass slides into high-resolution digital images for analysis by computer algorithms, revolutionizes the field by enhancing diagnostic accuracy, consistency, and efficiency through automated image analysis and large-scale data processing. Foundational transformer pretraining is crucial for developing robust, generalizable models as it enables learning from vast amounts of unannotated data. This paper introduces the Hibou family of foundational vision transformers for pathology, leveraging the DINOv2 framework to pretrain two model variants, Hibou-B and Hibou-L, on a proprietary dataset of over 1 million whole slide images (WSIs) representing diverse tissue types and staining techniques. Our pretrained models demonstrate superior performance on both patch-level and slide-level benchmarks, surpassing existing state-of-the-art methods. Notably, Hibou-L achieves the highest average accuracy across multiple benchmark datasets. To support further research and application in the field, we have open-sourced the Hibou-B model, which can be accessed at https://github.com/HistAI/hibou.
