Linking spatial biology and clinical histology via Haiku

April 30, 2026
Authors: Yan Cui, Jacob S. Leiby, Wenhui Lei, Dokyoon Kim, Yanxiang Deng, Aaron T. Mayer, Zhenqin Wu, Alexandro E. Trevino, Zhi Huang
cs.AI

Abstract

Integrating molecular, morphological, and clinical data is essential for basic and translational biomedical research, yet systematic frameworks for jointly modeling these modalities remain limited. Here we present Haiku, a tri-modal contrastive learning model trained on multiplexed immunofluorescence (mIF) data: 26.7 million spatial proteomics patches from 3,218 tissue sections across 1,606 patients spanning 11 organ types, with matched hematoxylin and eosin (H&E) histology and clinical metadata aligned in a shared embedding space. Haiku enables three-way cross-modal retrieval, improves downstream classification and clinical prediction over unimodal baselines, and supports zero-shot biomarker inference through fusion retrieval conditioned on text descriptions built from clinical metadata alone. Across tasks, Haiku outperforms competing approaches in cross-modal retrieval (Recall@50 up to 0.611 versus a near-zero baseline), survival prediction (C-index 0.737, a 7.91% relative improvement), and zero-shot biomarker inference (mean Pearson correlation 0.718 across 52 biomarkers). Furthermore, we introduce a counterfactual prediction framework in which modifying only the clinical metadata while holding tissue morphology fixed surfaces niche-specific molecular shifts associated with breast cancer stage progression and lung cancer survival outcomes. In a lung adenocarcinoma case study, the counterfactual analysis recovers niche-specific shifts characterized by increased CD8 and granzyme B, reduced PD-L1, and decreased Ki67, broadly consistent with patterns reported for favorable outcomes. We present these counterfactual results as exploratory, hypothesis-generating signals rather than mechanistic claims. These capabilities demonstrate that tri-modal alignment via Haiku enables integrative analysis of spatial biology, bridging molecular measurements with clinical context for biological exploration.
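
For readers unfamiliar with the Recall@K metric quoted in the abstract, the sketch below shows one common way to compute it for paired cross-modal embeddings. It is an illustration only, not the authors' evaluation code: the function names, the toy vectors, the use of cosine similarity, and the assumption that a query and its matching gallery item share the same index are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recall_at_k(queries, gallery, k):
    """Fraction of queries whose paired gallery item ranks in the top k.

    Assumption: queries[i] and gallery[i] are embeddings of the same
    tissue patch in two different modalities (e.g. H&E and mIF).
    """
    hits = 0
    for i, q in enumerate(queries):
        sims = [cosine(q, g) for g in gallery]
        # Rank gallery indices by similarity, highest first.
        ranked = sorted(range(len(gallery)), key=lambda j: sims[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return hits / len(queries)


# Toy, made-up embeddings: three H&E queries and their paired mIF patches.
he = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mif = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.7]]

print(recall_at_k(he, mif, 1))  # two of three queries retrieve their pair first
print(recall_at_k(he, mif, 2))  # all three pairs appear within the top two
```

Under this definition, a Recall@50 of 0.611 would mean that for about 61% of queries the correct counterpart appears among the 50 most similar candidates; the gallery size and the exact similarity function used in the paper are not stated in the abstract.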