Multiple Instance Learning Framework with Masked Hard Instance Mining for Gigapixel Histopathology Image Analysis

Abstract

Digitizing pathological images into gigapixel Whole Slide Images (WSIs) has opened new avenues for Computational Pathology (CPath). Because positive tissue makes up only a small fraction of a gigapixel WSI, existing Multiple Instance Learning (MIL) methods typically use attention mechanisms to identify salient instances. However, this biases the model towards easy-to-classify instances and neglects challenging ones. Recent studies have shown that hard examples are crucial for accurately modeling discriminative boundaries. Applying this idea at the instance level, we propose a novel MIL framework with masked hard instance mining (MHIM-MIL), which uses a Siamese structure with a consistency constraint to explore hard instances. Using a class-aware instance probability, MHIM-MIL employs a momentum teacher to mask salient instances and implicitly mine hard instances for training the student model. To obtain diverse, non-redundant hard instances, we adopt large-scale random masking and use a global recycle network to mitigate the risk of losing key features. Furthermore, the student updates the teacher through an exponential moving average, which lets the teacher identify new hard instances for subsequent training iterations and stabilizes optimization. Experimental results on cancer diagnosis, subtyping, survival analysis tasks, and 12 benchmarks demonstrate that MHIM-MIL outperforms the latest methods in both performance and efficiency. The code is available at: https://github.com/DearCaat/MHIM-MIL.
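The two mechanisms named in the abstract, masking salient instances with a teacher and updating that teacher by exponential moving average, can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository above for that). The function names `mask_salient` and `ema_update`, the `mask_ratio` and `momentum` parameters, and the toy values are all assumptions made for illustration. The random masking and the global recycle network mentioned in the abstract are omitted.

```python
from typing import List


def mask_salient(scores: List[float], mask_ratio: float) -> List[int]:
    """Return indices of instances kept after masking the most salient ones.

    scores: per-instance saliency from the teacher (hypothetical input).
    mask_ratio: fraction of instances to mask, highest scores first.
    """
    n_mask = int(len(scores) * mask_ratio)
    # Rank instance indices from most to least salient.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    masked = set(ranked[:n_mask])
    # Keep the remaining instances in their original order.
    return [i for i in range(len(scores)) if i not in masked]


def ema_update(teacher: List[float], student: List[float], momentum: float) -> List[float]:
    """Exponential moving average: teacher <- m * teacher + (1 - m) * student."""
    return [momentum * t + (1.0 - momentum) * s for t, s in zip(teacher, student)]


# Toy bag of five instances; scores stand in for the teacher's instance probabilities.
teacher_scores = [0.9, 0.1, 0.7, 0.3, 0.2]
kept = mask_salient(teacher_scores, mask_ratio=0.4)

# Toy parameter vectors; a real model would apply this to every weight tensor.
new_teacher = ema_update([1.0, 2.0], [3.0, 4.0], momentum=0.5)
```

In this toy setup the student would be trained only on the instances in `kept`, so the easiest (most salient) instances are hidden from it. After each step the teacher drifts towards the student, so the set of masked instances changes over training.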
