Geometric coherence of single-cell CRISPR perturbations reveals regulatory architecture and predicts cellular stress
April 17, 2026
Author: Prashant C. Raju
cs.AI
Abstract
Genome engineering has achieved remarkable sequence-level precision, yet predicting the transcriptomic state that a cell will occupy after perturbation remains an open problem. Single-cell CRISPR screens measure how far cells move from their unperturbed state, but this effect magnitude ignores a fundamental question: do the cells move together? Two perturbations with identical magnitude can produce qualitatively different outcomes if one drives cells coherently along a shared trajectory while the other scatters them across expression space. We introduce a geometric stability metric, Shesha, that quantifies the directional coherence of single-cell perturbation responses as the mean cosine similarity between individual cell shift vectors and the mean perturbation direction. Across five CRISPR datasets (more than 2,200 perturbations spanning CRISPRa, CRISPRi, and pooled screens), stability correlates strongly with effect magnitude (Spearman ρ = 0.75 to 0.97), with a calibrated cross-dataset correlation of 0.97. Crucially, discordant cases where the two metrics decouple expose regulatory architecture: pleiotropic master regulators such as CEBPA and GATA1 pay a "geometric tax," producing large but incoherent shifts, while lineage-specific factors such as KLF1 produce tightly coordinated responses. After controlling for magnitude, geometric instability is independently associated with elevated chaperone activation (HSPA5/BiP; partial ρ = -0.34 and -0.21 across datasets), and the high-stability/high-stress quadrant is systematically depleted. The magnitude-stability relationship persists in scGPT foundation model embeddings, confirming that it is a property of biological state space rather than an artifact of linear projection. Perturbation stability provides a complementary axis for hit prioritization in screens, phenotypic quality control in cell manufacturing, and evaluation of in silico perturbation predictions.
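The stability metric described in the abstract can be illustrated with a minimal sketch. This is not the author's implementation. It assumes that each cell's shift vector is its embedding minus the control centroid, that the mean perturbation direction is the average of those shift vectors, and that effect magnitude is the norm of that average; the function name `perturbation_stability` and the synthetic data are hypothetical.

```python
import numpy as np


def perturbation_stability(perturbed, control, eps=1e-12):
    """Return (stability, magnitude) for one perturbation.

    Assumptions (not confirmed by the source): inputs are
    (n_cells, n_features) arrays in a shared embedding such as PCA,
    and shifts are measured from the control centroid.
    """
    perturbed = np.asarray(perturbed, dtype=float)
    control = np.asarray(control, dtype=float)

    # Per-cell shift vectors relative to the unperturbed centroid.
    shifts = perturbed - control.mean(axis=0)

    # Mean perturbation direction and its length (effect magnitude).
    mean_shift = shifts.mean(axis=0)
    magnitude = float(np.linalg.norm(mean_shift))

    # Cosine similarity of every cell's shift with the mean shift.
    shift_norms = np.linalg.norm(shifts, axis=1)
    denom = np.maximum(shift_norms * magnitude, eps)  # guard against 0/0
    cosines = (shifts @ mean_shift) / denom

    return float(cosines.mean()), magnitude


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    control = rng.normal(size=(500, 20))

    # Two synthetic perturbations with the same mean shift but
    # different amounts of cell-to-cell scatter.
    direction = np.zeros(20)
    direction[0] = 3.0
    coherent = direction + 0.3 * rng.normal(size=(300, 20))
    scattered = direction + 3.0 * rng.normal(size=(300, 20))

    for name, cells in [("coherent", coherent), ("scattered", scattered)]:
        stability, magnitude = perturbation_stability(cells, control)
        print(f"{name}: stability={stability:.2f}, magnitude={magnitude:.2f}")
```

In this toy setup both perturbations move the population by a similar amount on average, but the scattered one scores a much lower stability. That is the kind of magnitude-stability decoupling the abstract highlights for pleiotropic regulators.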