Finding a Needle in a Haystack: Toward Weakly Supervised Log Instance Anomaly Localization via Counterfactual Perturbation

Abstract

Log anomaly detection is critical for system operations and security assurance. In large-scale networked systems, however, logs are generated in massive volumes while instance-level annotations are prohibitively expensive, which makes fine-grained anomaly localization difficult. To address this challenge, we propose LogMILP (Log anomaly localization based on Multi-Instance Learning enhanced by Prototypes and Perturbation), a weakly supervised framework that performs both bag-level anomaly detection and instance-level anomaly localization using only bag-level labels. The method combines prototype-guided structural modeling with counterfactual perturbation consistency regularization to guide the model toward the critical log entries, improving the reliability and interpretability of localization under coarse-grained supervision. Experimental results on three public datasets show that LogMILP achieves competitive detection performance while providing significantly more reliable instance-level localization. The code is available at https://github.com/YUK1207/LogMILP.
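
To make the distinction between bag-level detection and instance-level localization concrete, the following is a minimal sketch and not the LogMILP implementation. It assumes plain max-pooling multi-instance learning over hypothetical per-entry anomaly scores, and it illustrates a counterfactual check by removing the top-scoring entry and measuring how far the bag score falls. The function names and the example scores are illustrative assumptions; the prototype-guided modeling and the consistency regularization described above are not reproduced here.

```python
from typing import List, Tuple


def bag_score(instance_scores: List[float]) -> float:
    """Max-pooling MIL (assumed here): a bag is as anomalous as its worst entry."""
    return max(instance_scores) if instance_scores else 0.0


def localize(instance_scores: List[float]) -> int:
    """Return the index of the highest-scoring log entry in the bag."""
    return max(range(len(instance_scores)), key=instance_scores.__getitem__)


def counterfactual_drop(instance_scores: List[float]) -> Tuple[int, float]:
    """Remove the localized entry and report how much the bag score decreases.

    A large drop suggests the localized entry really drives the bag-level
    decision; a small drop suggests the localization is unreliable.
    """
    idx = localize(instance_scores)
    remaining = instance_scores[:idx] + instance_scores[idx + 1:]
    return idx, bag_score(instance_scores) - bag_score(remaining)


# Hypothetical per-entry scores for one bag (a window of four log entries).
scores = [0.05, 0.10, 0.92, 0.08]
idx, drop = counterfactual_drop(scores)
print(f"localized entry: {idx}, bag score drop after removal: {drop:.2f}")
```

In this toy setting only the bag label would be needed for training, yet the per-entry scores still yield a localization, and the removal test gives a simple way to ask whether that localization is faithful to the bag-level prediction.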