CGM-JEPA: Learning Consistent Continuous Glucose Monitor Representations via Predictive Self-Supervised Pretraining

Abstract

Continuous Glucose Monitoring (CGM) can detect early metabolic subphenotypes (insulin resistance, IR; β-cell dysfunction), but population-scale deployment faces two coupled problems. First, the same physiological state appears through multiple views (CGM time series, venous OGTT, Glucodensity summaries), so single-view representations fail to transfer when deployment shifts the modality or setting. Second, baselines perform inconsistently across these shifts. Both problems point to one remedy: representations that abstract away from any single view to capture higher-level temporal and distributional structure. We propose CGM-JEPA, a self-supervised pretraining framework that predicts masked latent representations rather than raw values, yielding abstraction that transfers across modalities. X-CGM-JEPA adds a masked Glucodensity cross-view objective for complementary distributional information. We pretrain on ~389k unlabeled CGM readings from 228 subjects and evaluate on two clinical cohorts (N=27 and N=17 public-release subsets) across three regimes (cohort generalization, venous-to-CGM transfer, home CGM) under 20-iteration × 2-fold cross-validation. X-CGM-JEPA ranks first or second on AUROC for both endpoints across all three regimes, which no baseline achieves, and exceeds the strongest baseline by up to +6.5 pp in cohort generalization and +3.6 pp in venous-to-CGM transfer (paired Wilcoxon, p<0.001). Under modality shift, it matches mean AUROC while redistributing performance toward weaker subgroups (ethnicity AUROC gap shrinks 25-54%); on sparse in-domain venous data, the distributional view lifts label-aware clustering (ARI +39%, NMI +40%). Code and weights: https://github.com/cruiseresearchgroup/CGM-JEPA
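
The evaluation protocol named in the abstract (20 iterations of 2-fold cross-validation, scored by AUROC) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' code: the stratified splitting, the seed handling, and the `fit_predict` interface are all hypothetical choices introduced here, and the paired Wilcoxon comparison between methods is left out.

```python
import random
from statistics import mean


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def repeated_two_fold(labels, n_iterations=20, seed=0):
    """Yield (train_idx, test_idx) pairs: 2 per iteration, 2 * n_iterations total.

    Assumption: splits are stratified by label so that both halves contain
    both classes. The abstract does not say whether the paper stratifies.
    """
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    for _ in range(n_iterations):
        half_a, half_b = [], []
        for indices in by_class.values():
            shuffled = indices[:]
            rng.shuffle(shuffled)
            half_a.extend(shuffled[0::2])  # every other subject of this class
            half_b.extend(shuffled[1::2])
        half_a.sort()
        half_b.sort()
        yield half_a, half_b  # train on A, test on B
        yield half_b, half_a  # then swap roles


def evaluate(features, labels, fit_predict, n_iterations=20, seed=0):
    """Return (mean AUROC, per-fold AUROCs) over all test folds.

    fit_predict(train_x, train_y, test_x) -> list of scores is a
    hypothetical interface standing in for any downstream classifier.
    """
    aucs = []
    for train, test in repeated_two_fold(labels, n_iterations, seed):
        scores = fit_predict([features[i] for i in train],
                             [labels[i] for i in train],
                             [features[i] for i in test])
        aucs.append(auroc([labels[i] for i in test], scores))
    return mean(aucs), aucs


if __name__ == "__main__":
    # Synthetic toy data only: 27 "subjects" with one noisy feature each.
    rng = random.Random(1)
    toy_labels = [0] * 14 + [1] * 13
    toy_features = [y + rng.gauss(0.0, 1.0) for y in toy_labels]
    # Trivial scorer: use the raw feature as the score (no fitting).
    mean_auc, fold_aucs = evaluate(toy_features, toy_labels,
                                   lambda tx, ty, sx: sx)
    print(f"{len(fold_aucs)} test folds, mean AUROC = {mean_auc:.3f}")
```

With cohorts as small as N=27 or N=17, a single 2-fold split gives a very noisy AUROC; repeating the split with fresh shuffles and averaging over all 40 test folds is one common way to stabilise the estimate, and the per-fold scores are what a paired test between two methods would consume.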
