LLaTiSA：視覚的知覚から意味論へ向けた難易度階層化時系列推論の実現を目指して

要旨

時系列データの包括的理解は、大規模言語モデル(LLM)における重要な課題として残されている。現在の研究は、断片化したタスク定義と本質的な曖昧性を伴うベンチマークによって妨げられており、厳密な評価と統一的な時系列推論モデル(TSRM)の開発を阻んでいる。この課題を解決するため、我々は認知的な複雑さが段階的に増加する4段階の分類法によって時系列推論(TSR)を体系化した。さらに、多様なタスク組み合わせと検証済みChain-of-Thought(CoT)軌跡を含む8万3千サンプルから構成される階層的時系列推論データセットHiTSRを導入した。HiTSRを活用し、視覚言語モデル(VLM)の時間的知覚を強化するために、視覚化されたパターンと精度較正された数値テーブルを統合した強力なTSRMであるLLaTiSAを提案する。多段階カリキュラム学習戦略を通じて、LLaTiSAは優れた性能を達成し、多様なTSRタスクと実世界シナリオにおいて頑健な分布外汎化性能を示した。実装コードはhttps://github.com/RainingNovember/LLaTiSAで公開されている。

English

Comprehensive understanding of time series remains a significant challenge for Large Language Models (LLMs). Current research is hindered by fragmented task definitions and benchmarks with inherent ambiguities, precluding rigorous evaluation and the development of unified Time Series Reasoning Models(TSRMs). To bridge this gap, we formalize Time Series Reasoning (TSR) via a four-level taxonomy of increasing cognitive complexity. We introduce HiTSR, a hierarchical time series reasoning dataset comprising 83k samples with diverse task combinations and verified Chain-of-Thought (CoT) trajectories. Leveraging HiTSR, we propose LLaTiSA, a strong TSRM that integrates visualized patterns with precision-calibrated numerical tables to enhance the temporal perception of Vision-Language Models (VLMs). Through a multi-stage curriculum fine-tuning strategy, LLaTiSA achieves superior performance and exhibits robust out-of-distribution generalization across diverse TSR tasks and real-world scenarios. Our code is available at https://github.com/RainingNovember/LLaTiSA.

LLaTiSA：視覚的知覚から意味論へ向けた難易度階層化時系列推論の実現を目指して

LLaTiSA: Towards Difficulty-Stratified Time Series Reasoning from Visual Perception to Semantics

要旨

Support