Finding a Needle in a Haystack: Weakly Supervised Log Instance Anomaly Localization via Counterfactual Perturbation

Abstract

Log anomaly detection is a critical task for system operations and security assurance. However, large-scale networked systems generate massive volumes of log data, while instance-level annotations are prohibitively expensive, which makes fine-grained anomaly localization difficult. To address this challenge, we propose LogMILP (Log anomaly localization based on Multi-Instance Learning enhanced by prototypes and Perturbation), a weakly supervised framework that enables both bag-level anomaly detection and instance-level anomaly localization using only bag-level labels. Our method combines prototype-guided structural modeling with counterfactual perturbation consistency regularization to guide the model toward the critical log entries, thereby improving localization reliability and interpretability under coarse-grained supervision. Experimental results on three public datasets demonstrate that LogMILP achieves competitive detection performance while yielding significantly more reliable instance-level localization. Our code is open-sourced at https://github.com/YUK1207/LogMILP.
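To make the bag-level versus instance-level distinction concrete, the following is a minimal, generic sketch of multi-instance pooling combined with a deletion-style counterfactual check. It is not the LogMILP implementation: the function names (`bag_score`, `counterfactual_drops`, `localize`), the softmax pooling rule, and the toy scores are all assumptions made for illustration, and the prototype component is omitted entirely.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def bag_score(instance_scores):
    """Hypothetical MIL pooling: softmax-weighted sum of per-instance scores.

    A bag is a sequence of log entries; only the bag carries a label.
    An empty bag is assigned a score of 0.0 by convention here.
    """
    if not instance_scores:
        return 0.0
    weights = softmax(instance_scores)
    return sum(w * s for w, s in zip(weights, instance_scores))

def counterfactual_drops(instance_scores):
    """Drop in bag score when each instance is removed in turn.

    A large positive drop suggests that the removed entry is what made
    the bag look anomalous (an assumed, simplified perturbation).
    """
    full = bag_score(instance_scores)
    drops = []
    for i in range(len(instance_scores)):
        rest = instance_scores[:i] + instance_scores[i + 1:]
        drops.append(full - bag_score(rest))
    return drops

def localize(instance_scores):
    """Index of the instance whose removal lowers the bag score the most."""
    drops = counterfactual_drops(instance_scores)
    return max(range(len(drops)), key=drops.__getitem__)

# Toy bag: the third log entry has a much higher (made-up) anomaly score.
scores = [0.1, 0.2, 3.0, 0.1]
suspect = localize(scores)
```

In this toy bag, removing the third entry collapses the bag score, so `localize` points at it, while removing any of the low-scoring entries leaves the bag looking anomalous. A consistency regularizer of the kind named in the abstract would presumably encourage the model's instance-level scores to agree with such perturbation effects during training, but the exact loss is not specified here.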