Learning to Detect Multi-class Anomalies with Just One Normal Image Prompt
May 14, 2025
Author: Bin-Bin Gao
cs.AI
Abstract
Unsupervised reconstruction networks built on self-attention transformers have
achieved state-of-the-art performance for multi-class (unified) anomaly
detection with a single model. However, these self-attention reconstruction
models operate primarily on target features, so both normal and anomalous
features may be reconstructed perfectly owing to their high consistency with
the surrounding context, causing anomalies to go undetected. In addition, these
models often produce inaccurate anomaly segmentation because reconstruction is
performed in a low-spatial-resolution latent space. To let reconstruction
models retain their high efficiency while improving their generalization for
unified anomaly detection, we propose a simple yet effective method that
reconstructs normal features and restores anomalous features with just One
Normal Image Prompt (OneNIP). In contrast to previous work, OneNIP for the
first time reconstructs or restores anomalies with just one normal image
prompt, substantially boosting unified anomaly detection performance.
Furthermore, we propose a supervised refiner that regresses reconstruction
errors using both real normal and synthesized anomalous images, which
significantly improves pixel-level anomaly segmentation. OneNIP outperforms
previous methods on three industrial anomaly detection benchmarks:
MVTec, BTAD, and VisA. The code and pre-trained models are available at
https://github.com/gaobb/OneNIP.
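To make the core intuition concrete, the following is a minimal NumPy sketch of prompt-guided reconstruction for anomaly scoring: target patch features are reconstructed as attention-weighted combinations of features from a single normal prompt image, so normal patches reconstruct well while anomalous patches incur large reconstruction error. All names, shapes, and the single-head attention form here are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def prompt_guided_reconstruction(target, prompt):
    """Reconstruct each target feature as an attention-weighted mixture of
    prompt (normal) features. target: (N_t, d), prompt: (N_p, d)."""
    d = target.shape[-1]
    attn = softmax(target @ prompt.T / np.sqrt(d), axis=-1)  # (N_t, N_p)
    return attn @ prompt                                     # (N_t, d)

def anomaly_map(target, prompt):
    # Per-patch anomaly score = reconstruction error against the prompt.
    recon = prompt_guided_reconstruction(target, prompt)
    return np.linalg.norm(target - recon, axis=-1)

# Toy demo: normal features cluster near zero; patch 2 is a synthetic anomaly.
rng = np.random.default_rng(0)
prompt = rng.normal(0.0, 0.1, size=(16, 8))   # features from one normal image
target = rng.normal(0.0, 0.1, size=(4, 8))    # features from a test image
target[2] += 5.0                              # inject an anomalous patch
scores = anomaly_map(target, prompt)
```

Because the reconstruction is constrained to mixtures of normal prompt features, the anomalous patch cannot be reproduced and receives the highest error score; this is the failure mode that pure self-attention over target features avoids at the cost of also reconstructing anomalies.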