

Estimating Time Series Foundation Model Transferability via In-Context Learning

September 28, 2025
Authors: Qingren Yao, Ming Jin, Chengqi Zhang, Chao-Han Huck Yang, Jun Qi, Shirui Pan
cs.AI

Abstract

Time series foundation models (TSFMs) offer strong zero-shot forecasting via large-scale pre-training, yet fine-tuning remains critical for boosting performance in domains with limited public data. With the growing number of TSFMs, efficiently identifying the best model for downstream fine-tuning becomes increasingly challenging. In this work, we introduce TimeTic, a transferability estimation framework that recasts model selection as an in-context-learning problem: given observations on known (source) datasets, it predicts how a TSFM will perform after fine-tuning on a downstream (target) dataset. TimeTic flexibly organizes the observed model-data relationships as contextual information, allowing it to adapt seamlessly to various test-time scenarios. Leveraging the natural tabular structure formed by dataset meta-features, model characteristics, and fine-tuned performance, we employ tabular foundation models to serve as in-context learners. We further introduce a novel model characterization based on entropy evolution across model layers, capturing embedding-space distinctions and enabling TimeTic to generalize across arbitrary model sets. We establish a comprehensive benchmark for transferability estimation including 10 datasets, 10 foundation models, and 3 forecasting tasks. On this benchmark, TimeTic's estimation demonstrates strong alignment with actual fine-tuned performance for previously unseen datasets, achieving a mean rank correlation of approximately 0.6 and a 30% improvement compared to using zero-shot performance as the transferability score.
PDF · October 1, 2025