Chronos: Learning the Language of Time Series
March 12, 2024
作者: Abdul Fatir Ansari, Lorenzo Stella, Caner Turkmen, Xiyuan Zhang, Pedro Mercado, Huibin Shen, Oleksandr Shchur, Syama Sundar Rangapuram, Sebastian Pineda Arango, Shubham Kapoor, Jasper Zschiegner, Danielle C. Maddix, Michael W. Mahoney, Kari Torkkola, Andrew Gordon Wilson, Michael Bohlke-Schneider, Yuyang Wang
cs.AI
Abstract
We introduce Chronos, a simple yet effective framework for pretrained
probabilistic time series models. Chronos tokenizes time series values using
scaling and quantization into a fixed vocabulary and trains existing
transformer-based language model architectures on these tokenized time series
via the cross-entropy loss. We pretrained Chronos models based on the T5 family
(ranging from 20M to 710M parameters) on a large collection of publicly
available datasets, complemented by a synthetic dataset that we generated via
Gaussian processes to improve generalization. In a comprehensive benchmark
consisting of 42 datasets, and comprising both classical local models and deep
learning methods, we show that Chronos models: (a) significantly outperform
other methods on datasets that were part of the training corpus; and (b) have
comparable and occasionally superior zero-shot performance on new datasets,
relative to methods that were trained specifically on them. Our results
demonstrate that Chronos models can leverage time series data from diverse
domains to improve zero-shot accuracy on unseen forecasting tasks, positioning
pretrained models as a viable tool to greatly simplify forecasting pipelines.
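To make the tokenization step concrete, here is a minimal sketch of the scaling-and-quantization idea the abstract describes: mean-scale the context window, then map each value to a token in a fixed vocabulary of uniform bins. This is an illustrative assumption, not the paper's exact implementation; the function names, bin count, and value range are hypothetical choices.

```python
import numpy as np

def tokenize(context: np.ndarray, n_bins: int = 4096,
             low: float = -15.0, high: float = 15.0):
    """Sketch of Chronos-style tokenization (illustrative settings).

    Mean-scales the series, then quantizes values into a fixed
    vocabulary of bin indices that a language model can consume."""
    # Mean scaling: divide by the mean absolute value of the context,
    # guarding against an all-zero window.
    scale = float(np.mean(np.abs(context)))
    scale = scale if scale > 0 else 1.0
    scaled = context / scale

    # Uniform bin edges over [low, high]; each value maps to a token id.
    edges = np.linspace(low, high, n_bins + 1)
    tokens = np.clip(np.digitize(scaled, edges) - 1, 0, n_bins - 1)
    return tokens, scale

def detokenize(tokens: np.ndarray, scale: float, n_bins: int = 4096,
               low: float = -15.0, high: float = 15.0) -> np.ndarray:
    """Invert the mapping: token id -> bin center, then undo the scaling."""
    edges = np.linspace(low, high, n_bins + 1)
    centers = (edges[:-1] + edges[1:]) / 2
    return centers[tokens] * scale

# Usage: round-trip a toy series through the fixed vocabulary.
series = np.array([10.0, 12.0, 11.5, 13.0, 12.5])
tokens, scale = tokenize(series)
recovered = detokenize(tokens, scale)
```

Once values are tokens, training reduces to standard next-token prediction with cross-entropy loss over the bin vocabulary, which is why an off-the-shelf transformer language model architecture such as T5 can be used without modification.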