Glocal Information Bottleneck for Time Series Imputation
October 6, 2025
Authors: Jie Yang, Kexin Zhang, Guibin Zhang, Philip S. Yu, Kaize Ding
cs.AI
Abstract
Time Series Imputation (TSI), which aims to recover missing values in
temporal data, remains a fundamental challenge due to the complex and often
high-rate missingness in real-world scenarios. Existing models typically
optimize the point-wise reconstruction loss, focusing on recovering numerical
values (local information). However, we observe that under high missing rates,
these models still perform well in the training phase yet produce poor
imputations and distorted latent representation distributions (global
information) in the inference phase. This reveals a critical optimization
dilemma: current objectives lack global guidance, leading models to overfit
local noise and fail to capture global information of the data. To address this
issue, we propose a new training paradigm, Glocal Information Bottleneck
(Glocal-IB). Glocal-IB is model-agnostic and extends the standard IB framework
by introducing a Global Alignment loss, derived from a tractable mutual
information approximation. This loss aligns the latent representations of
masked inputs with those of their originally observed counterparts. It helps
the model retain global structure and local details while suppressing noise
caused by missing values, giving rise to better generalization under high
missingness. Extensive experiments on nine datasets confirm that Glocal-IB
leads to consistently improved performance and aligned latent representations
under missingness. Our code implementation is available in
https://github.com/Muyiiiii/NeurIPS-25-Glocal-IB.
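The objective described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `encode`/`decode` callables, the MSE form of the alignment term, and the weight `lam` are assumptions standing in for an arbitrary imputation model and for the paper's tractable mutual-information approximation.

```python
import numpy as np

def glocal_loss(x_full, x_masked, mask, encode, decode, lam=0.1):
    """Sketch of a Glocal-IB-style objective: a local point-wise
    reconstruction loss on observed entries, plus a global alignment
    term pulling the latent of the masked input toward the latent of
    its fully observed counterpart. `encode`/`decode` are hypothetical
    stand-ins for any model's encoder and decoder."""
    z_masked = encode(x_masked)   # latent of the corrupted input
    z_full = encode(x_full)       # latent of the observed counterpart
    x_hat = decode(z_masked)      # imputed reconstruction

    # Local information: point-wise reconstruction error on observed positions.
    recon = np.mean((mask * (x_hat - x_full)) ** 2)

    # Global information: alignment of the two latent representations
    # (MSE here; the paper derives its term from a mutual-information
    # approximation, which this sketch does not reproduce).
    align = np.mean((z_masked - z_full) ** 2)
    return recon + lam * align
```

With identity `encode`/`decode` and an unmasked input, both terms vanish; masking entries makes the loss positive, since both the reconstruction and the latent representation are perturbed.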