CGM-JEPA: Learning Consistent Continuous Glucose Monitor Representations via Predictive Self-Supervised Pretraining

Abstract

Continuous Glucose Monitoring (CGM) can detect early metabolic subphenotypes (insulin resistance, IR; β-cell dysfunction), but population-scale deployment faces two coupled problems. First, the same physiological state appears through multiple views (CGM time series, venous OGTT, Glucodensity summaries), so single-view representations fail to transfer when deployment shifts the modality or setting. Second, baselines perform inconsistently across these shifts. Both problems point to one remedy: representations that abstract away from any single view to capture higher-level temporal and distributional structure. We propose CGM-JEPA, a self-supervised pretraining framework that predicts masked latent representations rather than raw values, yielding abstract representations that transfer across modalities. X-CGM-JEPA adds a masked Glucodensity cross-view objective to capture complementary distributional information. We pretrain on ~389k unlabeled CGM readings from 228 subjects and evaluate on two clinical cohorts (N=27 and N=17 public-release subsets) across three regimes (cohort generalization, venous-to-CGM transfer, home CGM) under 20-iteration × 2-fold cross-validation. X-CGM-JEPA ranks first or second on AUROC for both endpoints across all three regimes, which no baseline achieves, and exceeds the strongest baseline by up to +6.5 pp in cohort generalization and +3.6 pp in venous-to-CGM transfer (paired Wilcoxon, p < 0.001). Under modality shift, it matches the baselines' mean AUROC while redistributing performance toward weaker subgroups (the ethnicity AUROC gap shrinks by 25-54%); on sparse in-domain venous data, the distributional view improves label-aware clustering (ARI +39%, NMI +40%). Code and weights: https://github.com/cruiseresearchgroup/CGM-JEPA
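
To make the evaluation protocol above concrete, the following is a minimal sketch of 20 iterations of 2-fold cross-validation scored by AUROC. It is an illustration only: the abstract does not say whether the splits are stratified or how the endpoints are encoded, so binary labels and stratified folds are assumptions here, and the function names `auroc` and `repeated_stratified_2fold` are hypothetical rather than taken from the released code.

```python
import random


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    Assumes binary labels encoded as 1 (positive) and 0 (negative).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def repeated_stratified_2fold(labels, iterations=20, seed=0):
    """Yield (iteration, train_idx, test_idx) for reshuffled 2-fold splits.

    Each iteration reshuffles the subjects within each class, halves every
    class, and uses each half once as the test fold (assumed stratification).
    A class with an odd count puts its extra subject in the second fold.
    """
    rng = random.Random(seed)
    for it in range(iterations):
        fold_a, fold_b = [], []
        for cls in sorted(set(labels)):
            idx = [i for i, y in enumerate(labels) if y == cls]
            rng.shuffle(idx)
            half = len(idx) // 2
            fold_a.extend(idx[:half])
            fold_b.extend(idx[half:])
        yield it, sorted(fold_a), sorted(fold_b)
        yield it, sorted(fold_b), sorted(fold_a)
```

With 20 iterations the generator produces 40 train/test pairs; a classifier fitted on each training fold would be scored with `auroc` on the matching test fold, and the 40 scores averaged. Paired comparisons between methods, such as the Wilcoxon test mentioned above, would then operate on per-split scores obtained from identical splits.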
