Lingshu-Cell: A generative cellular world model for transcriptome modeling toward virtual cells
March 26, 2026
Authors: Han Zhang, Guo-Hua Yuan, Chaohao Yuan, Tingyang Xu, Tian Bian, Hong Cheng, Wenbing Huang, Deli Zhao, Yu Rong
cs.AI
Abstract
Modeling cellular states and predicting their responses to perturbations are central challenges in computational biology and the development of virtual cells. Existing foundation models for single-cell transcriptomics provide powerful static representations, but they do not explicitly model the distribution of cellular states for generative simulation. Here, we introduce Lingshu-Cell, a masked discrete diffusion model that learns transcriptomic state distributions and supports conditional simulation under perturbation. By operating directly in a discrete token space that is compatible with the sparse, non-sequential nature of single-cell transcriptomic data, Lingshu-Cell captures complex transcriptome-wide expression dependencies across approximately 18,000 genes without relying on prior gene selection, such as filtering by high variability or ranking by expression level. Across diverse tissues and species, Lingshu-Cell accurately reproduces transcriptomic distributions, marker-gene expression patterns and cell-subtype proportions, demonstrating its ability to capture complex cellular heterogeneity. Moreover, by jointly embedding cell type or donor identity with perturbation, Lingshu-Cell can predict whole-transcriptome expression changes for novel combinations of identity and perturbation. It achieves leading performance on the Virtual Cell Challenge H1 genetic perturbation benchmark and in predicting cytokine-induced responses in human PBMCs. Together, these results establish Lingshu-Cell as a flexible cellular world model for in silico simulation of cell states and perturbation responses, laying the foundation for a new paradigm in biological discovery and perturbation screening.
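To make the phrase "masked discrete diffusion" concrete, the following is a minimal sketch of the generic technique on a toy count vector, not the authors' implementation. It assumes that each gene's expression value is used directly as a token, that masking is applied independently per position with probability t, and that sampling reveals masked positions in equal batches; the abstract specifies none of these details. The names `MASK`, `mask_tokens`, `unmask` and `toy_denoiser` are hypothetical, and `toy_denoiser` merely stands in for a learned network.

```python
import numpy as np

MASK = -1  # hypothetical id for the absorbing [MASK] state (assumption)

def mask_tokens(tokens, t, rng):
    """Forward process: independently replace each token with MASK with probability t."""
    tokens = np.asarray(tokens)
    hide = rng.random(tokens.shape) < t
    return np.where(hide, MASK, tokens), hide

def unmask(tokens, denoise, steps, rng):
    """Reverse process: reveal masked positions in roughly equal batches over `steps` rounds."""
    x = np.array(tokens, copy=True)
    for step in range(steps):
        masked = np.flatnonzero(x == MASK)
        if masked.size == 0:
            break
        # Reveal an even share of the positions that are still masked.
        n_reveal = int(np.ceil(masked.size / (steps - step)))
        chosen = rng.choice(masked, size=n_reveal, replace=False)
        proposal = denoise(x)  # one predicted token per position
        x[chosen] = proposal[chosen]
    return x

def toy_denoiser(x):
    """Placeholder for a learned model: predict the most common visible token everywhere."""
    visible = x[x != MASK]
    fill = np.bincount(visible).argmax() if visible.size else 0
    return np.full_like(x, fill)

rng = np.random.default_rng(0)
cell = rng.poisson(0.3, size=20)  # sparse toy count vector, one token per gene
noisy, hide = mask_tokens(cell, t=0.5, rng=rng)
restored = unmask(noisy, toy_denoiser, steps=4, rng=rng)
```

In this sketch the order of genes plays no role in either direction, which is one way to read the abstract's remark that a discrete token space suits sparse, non-sequential data. Conditioning on cell type, donor identity or perturbation would enter through the denoiser and is omitted here.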