TimeSeriesScientist: A General-Purpose AI Agent for Time Series Analysis
October 2, 2025
Authors: Haokun Zhao, Xiang Zhang, Jiaqi Wei, Yiwei Xu, Yuting He, Siqi Sun, Chenyu You
cs.AI
Abstract
Time series forecasting is central to decision-making in domains as diverse
as energy, finance, climate, and public health. In practice, forecasters face
thousands of short, noisy series that vary in frequency, quality, and horizon,
where the dominant cost lies not in model fitting, but in the labor-intensive
preprocessing, validation, and ensembling required to obtain reliable
predictions. Prevailing statistical and deep learning models are tailored to
specific datasets or domains and generalize poorly. A general, domain-agnostic
framework that minimizes human intervention is urgently needed. In this
paper, we introduce TimeSeriesScientist (TSci), the first LLM-driven agentic
framework for general time series forecasting. The framework comprises four
specialized agents: Curator performs LLM-guided diagnostics augmented by
external tools that reason over data statistics to choose targeted
preprocessing; Planner narrows the hypothesis space of model choice by
leveraging multi-modal diagnostics and self-planning over the input; Forecaster
performs model fitting and validation and, based on the results, adaptively
selects the best model configuration and ensemble strategy to produce the final
predictions; and Reporter synthesizes the whole process into a comprehensive,
transparent report. With transparent natural-language rationales and
comprehensive reports, TSci transforms the forecasting workflow into a
white-box system that is both interpretable and extensible across tasks.
Empirical results on eight established benchmarks demonstrate that TSci
consistently outperforms both statistical and LLM-based baselines, reducing
forecast error by an average of 10.4% and 38.2%, respectively. Moreover, TSci
produces a clear and rigorous report that makes the forecasting workflow more
transparent and interpretable.
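
To make the four-stage flow concrete, the following is a minimal sketch of a Curator, Planner, Forecaster, Reporter pipeline. It is not the TSci implementation: every function name here is hypothetical, the LLM-guided reasoning is replaced by fixed rules, and the candidate models (naive, mean, drift) and the MAE holdout score are assumptions chosen only to keep the example self-contained.

```python
from statistics import mean


def naive(history, horizon):
    # Repeat the last observed value.
    return [history[-1]] * horizon


def mean_model(history, horizon):
    # Repeat the historical mean.
    return [mean(history)] * horizon


def drift(history, horizon):
    # Extrapolate the average step between the first and last observations.
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * (h + 1) for h in range(horizon)]


def curate(series):
    # Curator stand-in (assumption): forward-fill missing values (None)
    # and drop any gaps that occur before the first observation.
    cleaned = []
    for x in series:
        if x is None and cleaned:
            cleaned.append(cleaned[-1])
        elif x is not None:
            cleaned.append(float(x))
    return cleaned


def plan(series):
    # Planner stand-in (assumption): a fixed rule narrows the candidate set.
    candidates = {"naive": naive, "mean": mean_model}
    if len(series) >= 4:  # drift needs >= 2 points after the holdout split
        candidates["drift"] = drift
    return candidates


def mae(actual, predicted):
    # Mean absolute error between two equal-length sequences.
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def forecast(series, candidates, horizon, val_size=2, top_k=2):
    # Forecaster stand-in: score each candidate on a holdout,
    # then average the forecasts of the top_k models refit on all data.
    train, val = series[:-val_size], series[-val_size:]
    scores = {name: mae(val, f(train, val_size)) for name, f in candidates.items()}
    chosen = sorted(scores, key=scores.get)[:top_k]
    runs = [candidates[name](series, horizon) for name in chosen]
    ensemble = [mean(step) for step in zip(*runs)]
    return ensemble, scores, chosen


def report(scores, chosen, prediction):
    # Reporter stand-in: plain-text summary of what was tried and selected.
    lines = [f"{name}: validation MAE = {score:.3f}"
             for name, score in sorted(scores.items())]
    lines.append("ensemble of: " + ", ".join(chosen))
    lines.append("forecast: " + ", ".join(f"{p:.2f}" for p in prediction))
    return "\n".join(lines)


def run_pipeline(raw, horizon):
    # Chain the four stages in the order the abstract describes.
    series = curate(raw)
    candidates = plan(series)
    prediction, scores, chosen = forecast(series, candidates, horizon)
    return prediction, report(scores, chosen, prediction)
```

Calling `run_pipeline([1, 2, None, 4, 5, 6], horizon=2)` returns the ensemble forecast together with a short text report listing each candidate's validation error and the models that were averaged. In the framework described above, the rule-based choices in `curate` and `plan` would instead be made by LLM agents reasoning over diagnostics.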