SenTSR-Bench: Thinking with Injected Knowledge for Time-Series Reasoning

Abstract

Time-series diagnostic reasoning is essential for many applications, yet existing solutions face a persistent gap. General reasoning large language models (GRLMs) possess strong reasoning skills but lack the domain-specific knowledge needed to understand complex time-series patterns. Conversely, fine-tuned time-series LLMs (TSLMs) understand these patterns but lack the capacity to generalize their reasoning to more complicated questions. To bridge this gap, we propose a hybrid knowledge-injection framework that injects TSLM-generated insights directly into the GRLM's reasoning trace, thereby achieving strong time-series reasoning with in-domain knowledge. Because collecting data for knowledge-injection fine-tuning is costly, we further leverage reinforcement learning with verifiable rewards (RLVR) to elicit knowledge-rich traces without human supervision, and then transfer these in-domain thinking traces into the GRLM for efficient knowledge injection. We also release SenTSR-Bench, a multivariate time-series diagnostic reasoning benchmark collected from real-world industrial operations. Across SenTSR-Bench and other public datasets, our method consistently surpasses TSLMs by 9.1% to 26.1% and GRLMs by 7.9% to 22.4%, delivering robust, context-aware time-series diagnostic insights.
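The injection idea can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the prompt layout, and the `<think>` delimiter are all assumptions made for illustration, and both models are represented as plain text-in, text-out callables. The sketch only shows the general shape of the technique: a TSLM produces domain observations, and those observations are placed at the start of the reasoning trace that the GRLM is then asked to continue.

```python
from typing import Callable

# Hypothetical type: any text-in, text-out model call.
TextModel = Callable[[str], str]

# Assumed reasoning-trace delimiter; not taken from the paper.
THINK_OPEN = "<think>"


def build_injected_prompt(question: str, series_text: str, insight: str) -> str:
    """Return a prompt whose reasoning trace already starts with the TSLM insight."""
    return (
        f"Time series:\n{series_text}\n\n"
        f"Question: {question}\n\n"
        f"{THINK_OPEN}\n"
        f"Domain observations: {insight.strip()}\n"
        "Building on these observations, "
    )


def answer_with_injection(question: str, series_text: str,
                          tslm: TextModel, grlm: TextModel) -> str:
    """Ask the TSLM for insights, then let the GRLM continue the seeded trace."""
    insight = tslm(f"Describe the notable patterns.\n{series_text}\n{question}")
    return grlm(build_injected_prompt(question, series_text, insight))


if __name__ == "__main__":
    # Stub models so the sketch runs without any external service.
    fake_tslm = lambda prompt: "sensor_2 spikes at t=40 while sensor_1 stays flat."
    fake_grlm = lambda prompt: prompt.splitlines()[-1] + "the fault is likely local to sensor_2."
    print(answer_with_injection("What is the likely fault?",
                                "s1=[...], s2=[...]", fake_tslm, fake_grlm))
```

The trace is deliberately left open after the injected observations, so that a reasoning model would continue from them rather than restart its analysis. Whether the paper seeds the trace in this exact way is not stated in the abstract.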
