Learning to Detect Multi-class Anomalies with Just One Normal Image Prompt
May 14, 2025
Author: Bin-Bin Gao
cs.AI
Abstract
Unsupervised reconstruction networks using self-attention transformers have
achieved state-of-the-art performance for multi-class (unified) anomaly
detection with a single model. However, these self-attention reconstruction
models primarily operate on target features, which can yield near-perfect
reconstruction of both normal and anomalous features owing to their high
consistency with context, causing anomaly detection to fail. Additionally,
these models often produce inaccurate anomaly segmentation because they
perform reconstruction in a low-spatial-resolution latent space. To let
reconstruction models retain high efficiency while improving their
generalization for unified anomaly detection, we propose a simple yet
effective method that reconstructs normal features and restores anomalous
features with just One Normal Image Prompt (OneNIP). In contrast to previous
work, OneNIP is the first to reconstruct or restore anomalies with just one
normal image prompt, effectively boosting unified anomaly detection
performance. Furthermore, we propose a supervised refiner that regresses
reconstruction errors by using both real normal and synthesized anomalous
images, which significantly improves pixel-level anomaly segmentation. OneNIP
outperforms previous methods on three industrial anomaly detection benchmarks:
MVTec, BTAD, and VisA. The code and pre-trained models are available at
https://github.com/gaobb/OneNIP.
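The prompt-guided reconstruction idea can be illustrated with a toy numpy sketch. This is a hypothetical simplification, not the authors' implementation: target patch features are re-expressed as attention-weighted combinations of patch features from a single normal prompt image, so target features with no close match among the prompt's normal patches incur a larger reconstruction error, which serves as the anomaly score.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def prompt_reconstruct(target, prompt):
    """Reconstruct target features as attention-weighted combinations of
    prompt (normal) features. Because the output is a convex combination of
    normal features only, anomalous targets cannot be reconstructed well."""
    scale = np.sqrt(target.shape[-1])
    attn = softmax(target @ prompt.T / scale)  # (N_target, N_prompt)
    return attn @ prompt                       # (N_target, d)

# Toy data (hypothetical): normal patch features cluster around one direction.
rng = np.random.default_rng(0)
d = 8
normal_dir = rng.normal(size=d)
prompt = normal_dir + 0.05 * rng.normal(size=(16, d))       # patches of one normal prompt image
normal_feat = normal_dir + 0.05 * rng.normal(size=(4, d))   # normal test patches
anomaly_feat = -normal_dir + 0.05 * rng.normal(size=(4, d)) # patches far from the normal manifold

err_normal = np.linalg.norm(normal_feat - prompt_reconstruct(normal_feat, prompt), axis=-1).mean()
err_anom = np.linalg.norm(anomaly_feat - prompt_reconstruct(anomaly_feat, prompt), axis=-1).mean()
print(f"mean reconstruction error  normal: {err_normal:.3f}  anomaly: {err_anom:.3f}")
```

The anomalous patches receive a much larger reconstruction error than the normal ones, which is the separation the paper's one-normal-image-prompt design relies on; the paper's supervised refiner then regresses such error maps at higher resolution for sharper segmentation.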