Glocal Information Bottleneck for Time Series Imputation

Abstract

Time Series Imputation (TSI), which aims to recover missing values in temporal data, remains a fundamental challenge because missingness in real-world scenarios is complex and often occurs at high rates. Existing models typically optimize a point-wise reconstruction loss, focusing on recovering numerical values (local information). However, we observe that under high missing rates, these models still perform well in the training phase yet produce poor imputations and distorted latent representation distributions (global information) in the inference phase. This reveals a critical optimization dilemma: current objectives lack global guidance, leading models to overfit local noise and fail to capture the global information of the data. To address this issue, we propose a new training paradigm, Glocal Information Bottleneck (Glocal-IB). Glocal-IB is model-agnostic and extends the standard IB framework by introducing a Global Alignment loss, derived from a tractable mutual information approximation. This loss aligns the latent representations of masked inputs with those of their originally observed counterparts. It helps the model retain global structure and local details while suppressing noise caused by missing values, giving rise to better generalization under high missingness. Extensive experiments on nine datasets confirm that Glocal-IB leads to consistently improved performance and aligned latent representations under missingness. Our code implementation is available at https://github.com/Muyiiiii/NeurIPS-25-Glocal-IB.
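The combination of a local reconstruction term and a global alignment term described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the exact form of the Global Alignment loss, so the cosine-distance alignment, the toy linear `encode`/`decode` functions, the zero-filling of masked points, and the weight `lam` are all assumptions made purely for illustration.

```python
import math
import random


def encode(x, weights):
    """Toy linear encoder (hypothetical): each latent unit is a weighted sum of the input."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def decode(z, weights):
    """Toy linear decoder (hypothetical) that reuses the transposed encoder weights."""
    n = len(weights[0])
    return [sum(weights[k][i] * z[k] for k in range(len(z))) for i in range(n)]


def reconstruction_loss(x_hat, x, mask):
    """Local term: mean squared error over the masked positions (mask == 0)."""
    errs = [(a - b) ** 2 for a, b, m in zip(x_hat, x, mask) if m == 0]
    return sum(errs) / len(errs) if errs else 0.0


def alignment_loss(z_masked, z_full, eps=1e-12):
    """Global term (assumed form): 1 - cosine similarity of the two latent vectors."""
    dot = sum(a * b for a, b in zip(z_masked, z_full))
    norm = math.sqrt(sum(a * a for a in z_masked)) * math.sqrt(sum(b * b for b in z_full))
    return 1.0 - dot / (norm + eps)


def glocal_style_loss(x, mask, weights, lam=0.5):
    """Return (total, local, global) for one series; lam is an assumed trade-off weight."""
    x_masked = [v * m for v, m in zip(x, mask)]  # zero-fill the masked points
    z_masked = encode(x_masked, weights)         # latent of the masked input
    z_full = encode(x, weights)                  # latent of the originally observed input
    x_hat = decode(z_masked, weights)
    local = reconstruction_loss(x_hat, x, mask)
    global_ = alignment_loss(z_masked, z_full)
    return local + lam * global_, local, global_


random.seed(0)
x = [math.sin(0.3 * t) for t in range(24)]                  # synthetic series
mask = [0 if random.random() < 0.6 else 1 for _ in x]       # roughly 60% masked
weights = [[random.gauss(0.0, 0.3) for _ in x] for _ in range(4)]
total, local, global_ = glocal_style_loss(x, mask, weights)
print(f"local={local:.4f} global={global_:.4f} total={total:.4f}")
```

In this sketch the alignment term is zero when masking leaves the latent direction unchanged and grows as the masked latent drifts away from the fully observed one, which is the behavior the abstract attributes to global guidance. A real model would replace the linear maps with a trained encoder and decoder and would use the mutual-information-based loss from the paper.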
