Learning to Detect Multi-class Anomalies with Just One Normal Image Prompt

Abstract

Unsupervised reconstruction networks built on self-attention transformers have achieved state-of-the-art performance for multi-class (unified) anomaly detection with a single model. However, these self-attention reconstruction models operate primarily on target features, so both normal and anomalous features may be reconstructed almost perfectly because of their high consistency with the surrounding context, and the anomalies then go undetected. In addition, these models often produce inaccurate anomaly segmentation because they perform reconstruction in a latent space with low spatial resolution. To keep reconstruction models efficient while improving their generalization for unified anomaly detection, we propose a simple yet effective method that reconstructs normal features and restores anomalous features with just One Normal Image Prompt (OneNIP). In contrast to previous work, OneNIP is the first method that reconstructs or restores anomalies with just one normal image prompt, which effectively boosts unified anomaly detection performance. Furthermore, we propose a supervised refiner that regresses reconstruction errors using both real normal images and synthesized anomalous images, which significantly improves pixel-level anomaly segmentation. OneNIP outperforms previous methods on three industrial anomaly detection benchmarks: MVTec, BTAD, and VisA. The code and pre-trained models are available at https://github.com/gaobb/OneNIP.
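The abstract's point about low-resolution latent spaces can be illustrated with a generic reconstruction-error scorer. The sketch below is not the OneNIP architecture or its refiner; it only shows, under simple assumptions, how a per-location error map is typically derived from a feature map and its reconstruction, and why upsampling that map yields blocky segmentation. The function names (`reconstruction_error_map`, `upsample_nearest`, `image_score`) and the toy values are hypothetical and chosen for illustration.

```python
import math


def reconstruction_error_map(features, reconstructed):
    """Per-location L2 distance between two H x W x C feature maps (nested lists)."""
    return [
        [
            math.sqrt(sum((f - r) ** 2 for f, r in zip(f_vec, r_vec)))
            for f_vec, r_vec in zip(f_row, r_row)
        ]
        for f_row, r_row in zip(features, reconstructed)
    ]


def upsample_nearest(error_map, scale):
    """Nearest-neighbour upsampling of an H x W map by an integer factor."""
    out = []
    for row in error_map:
        # Repeat every value `scale` times horizontally, then the row vertically.
        wide = [value for value in row for _ in range(scale)]
        out.extend(list(wide) for _ in range(scale))
    return out


def image_score(error_map):
    """Image-level anomaly score: the largest per-location error (an assumed choice)."""
    return max(max(row) for row in error_map)


# Toy 2 x 2 latent feature map with 2 channels (illustrative values only).
features = [[[1.0, 0.0], [0.0, 1.0]],
            [[1.0, 1.0], [3.0, 4.0]]]
# Assumed output of a reconstruction model: three locations are reproduced
# exactly, the bottom-right location is not.
reconstructed = [[[1.0, 0.0], [0.0, 1.0]],
                 [[1.0, 1.0], [0.0, 0.0]]]

low_res = reconstruction_error_map(features, reconstructed)
pixel_map = upsample_nearest(low_res, scale=2)
score = image_score(low_res)
```

In this toy case a single anomalous latent location becomes a uniform 2 x 2 block in the upsampled map, which is the kind of coarse segmentation the abstract attributes to low-resolution reconstruction. If a model also reconstructed the anomalous location perfectly, every error would be zero and the anomaly would be missed, which is the failure mode the abstract describes.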

