SenTSR-Bench: Thinking with Injected Knowledge for Time-Series Reasoning

Abstract

Time-series diagnostic reasoning is essential for many applications, yet existing solutions face a persistent gap. General reasoning large language models (GRLMs) possess strong reasoning skills but lack the domain-specific knowledge needed to understand complex time-series patterns. Conversely, fine-tuned time-series LLMs (TSLMs) understand these patterns but cannot generalize their reasoning to more complicated questions. To bridge this gap, we propose a hybrid knowledge-injection framework that injects TSLM-generated insights directly into the GRLM's reasoning trace, thereby achieving strong time-series reasoning grounded in in-domain knowledge. Because collecting data for knowledge-injection fine-tuning is costly, we further use reinforcement learning with verifiable rewards (RLVR) to elicit knowledge-rich traces without human supervision, then transfer these in-domain thinking traces into the GRLM for efficient knowledge injection. We also release SenTSR-Bench, a multivariate time-series diagnostic reasoning benchmark collected from real-world industrial operations. Across SenTSR-Bench and other public datasets, our method consistently surpasses TSLMs by 9.1%-26.1% and GRLMs by 7.9%-22.4%, delivering robust, context-aware time-series diagnostic insights.
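
The injection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the text-in, text-out model interface, the function name `inject_knowledge`, the `<think>` delimiter, and the stub models are all assumptions made for illustration. The idea shown is only that the TSLM's insight is placed at the start of the GRLM's reasoning trace, so the GRLM continues reasoning from it.

```python
from typing import Callable

# Hypothetical interface: each model is a plain text-in, text-out callable.
Model = Callable[[str], str]


def inject_knowledge(question: str, series_text: str,
                     tslm: Model, grlm: Model,
                     think_open: str = "<think>") -> str:
    """Prefill the GRLM's reasoning trace with a TSLM-generated insight.

    The "<think>" delimiter is an assumption; real reasoning models
    may mark their thinking trace differently.
    """
    task = f"{series_text}\n\nQuestion: {question}"
    # 1. Ask the domain model for its reading of the time series.
    insight = tslm(task).strip()
    # 2. Open the reasoning trace and seed it with that insight.
    prefix = f"{task}\n{think_open}\n{insight}\n"
    # 3. Let the general reasoner continue from the seeded trace.
    continuation = grlm(prefix)
    return prefix + continuation


# Stub models standing in for a real TSLM and GRLM (illustration only).
def stub_tslm(prompt: str) -> str:
    return "Sensor B shows a level shift at t=3."


def stub_grlm(prompt: str) -> str:
    return ("Given that shift, the likely cause is a valve fault.</think>\n"
            "Answer: valve fault")


result = inject_knowledge(
    question="What is the most likely fault?",
    series_text="A: 1 1 1 1 1\nB: 0 0 0 5 5",
    tslm=stub_tslm,
    grlm=stub_grlm,
)
print(result)
```

Passing the models as callables keeps the sketch independent of any particular inference library; in practice each stub would be replaced by a call to the corresponding model.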