

MOOZY: A Patient-First Foundation Model for Computational Pathology

March 27, 2026
作者: Yousef Kotp, Vincent Quoc-Huy Trinh, Christopher Pal, Mahdi S. Hosseini
cs.AI

Abstract

Computational pathology needs whole-slide image (WSI) foundation models that transfer across diverse clinical tasks, yet current approaches remain largely slide-centric, often depend on private data and expensive paired-report supervision, and do not explicitly model relationships among multiple slides from the same patient. We present MOOZY, a patient-first pathology foundation model in which the patient case, not the individual slide, is the core unit of representation. MOOZY explicitly models dependencies across all slides from the same patient via a case transformer during pretraining, combining multi-stage open self-supervision with scaled low-cost task supervision. In Stage 1, we pretrain a vision-only slide encoder on 77,134 public slide feature grids using masked self-distillation. In Stage 2, we align these representations with clinical semantics using a case transformer and multi-task supervision over 333 tasks from 56 public datasets, including 205 classification and 128 survival tasks across four endpoints. Across eight held-out tasks with five-fold frozen-feature probe evaluation, MOOZY achieves best or tied-best performance on most metrics and improves macro averages over TITAN by +7.37%, +5.50%, and +7.83% and over PRISM by +8.83%, +10.70%, and +9.78% for weighted F1, weighted ROC-AUC, and balanced accuracy, respectively. MOOZY is also parameter efficient with 85.77M parameters, 14x smaller than GigaPath. These results demonstrate that open, reproducible patient-level pretraining yields transferable embeddings, providing a practical path toward scalable patient-first histopathology foundation models.
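The abstract's central idea, a case transformer that pools all of a patient's slide embeddings into one case-level representation that a frozen-feature probe can then be trained on, can be illustrated with a minimal sketch. The code below is a hypothetical single-head attention-pooling toy in NumPy, not the paper's actual architecture: the function name `case_embedding`, the learnable case token, and all weight shapes are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def case_embedding(slide_embs, Wq, Wk, Wv, case_token):
    """Pool a variable number of per-slide embeddings from one patient
    into a single case embedding: a learnable case token attends over
    all slides (hypothetical sketch of the case-transformer idea)."""
    q = case_token @ Wq                             # (1, d) query from the case token
    k = slide_embs @ Wk                             # (n_slides, d) keys
    v = slide_embs @ Wv                             # (n_slides, d) values
    attn = softmax(q @ k.T / np.sqrt(k.shape[1]))   # (1, n_slides) weights over slides
    return (attn @ v)[0]                            # (d,) case-level embedding

rng = np.random.default_rng(0)
d = 8
slides = rng.normal(size=(3, d))                    # 3 slides from one patient
Wq, Wk, Wv = (0.1 * rng.normal(size=(d, d)) for _ in range(3))
case_tok = rng.normal(size=(1, d))
emb = case_embedding(slides, Wq, Wk, Wv, case_tok)
print(emb.shape)  # (8,)
```

In the frozen-feature probe evaluation the abstract describes, such case embeddings would be held fixed and only a lightweight classifier (e.g. logistic regression) trained on top of them per held-out task.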
PDF · April 1, 2026