Reasoning Language Models for Root Cause Analysis in 5G Wireless Networks
July 29, 2025
Authors: Mohamed Sana, Nicola Piovesan, Antonio De Domenico, Yibin Kang, Haozhe Zhang, Merouane Debbah, Fadhel Ayed
cs.AI
Abstract
Root Cause Analysis (RCA) in mobile networks remains a challenging task due
to the need for interpretability, domain expertise, and causal reasoning. In
this work, we propose a lightweight framework that leverages Large Language
Models (LLMs) for RCA. To do so, we introduce TeleLogs, a curated dataset of
annotated troubleshooting problems designed to benchmark RCA capabilities. Our
evaluation reveals that existing open-source reasoning LLMs struggle with these
problems, underscoring the need for domain-specific adaptation. To address this
issue, we propose a two-stage training methodology that combines supervised
fine-tuning with reinforcement learning to improve the accuracy and reasoning
quality of LLMs. The proposed approach fine-tunes a series of RCA models to
integrate domain knowledge and generate structured, multi-step diagnostic
explanations, improving both interpretability and effectiveness. Extensive
experiments across multiple LLM sizes show significant performance gains over
state-of-the-art reasoning and non-reasoning models, including strong
generalization to randomized test variants. These results demonstrate the
promise of domain-adapted, reasoning-enhanced LLMs for practical and
explainable RCA in network operation and management.