MetaUAS: Universal Anomaly Segmentation with One-Prompt Meta-Learning
May 14, 2025
Author: Bin-Bin Gao
cs.AI
Abstract
Zero- and few-shot visual anomaly segmentation relies on powerful
vision-language models that detect unseen anomalies using manually designed
textual prompts. However, visual representations are inherently independent of
language. In this paper, we explore the potential of a pure visual foundation
model as an alternative to widely used vision-language models for universal
visual anomaly segmentation. We present a novel paradigm that unifies anomaly
segmentation into change segmentation. This paradigm enables us to leverage
large-scale synthetic image pairs, featuring object-level and local region
changes, derived from existing image datasets, which are independent of target
anomaly datasets. We propose a one-prompt Meta-learning framework for Universal
Anomaly Segmentation (MetaUAS) that is trained on this synthetic dataset and
then generalizes well to segment any novel or unseen visual anomalies in the
real world. To handle geometrical variations between prompt and query images,
we propose a soft feature alignment module that bridges paired-image change
perception and single-image semantic segmentation. This is the first work to
achieve universal anomaly segmentation using a pure vision model without
relying on specialized anomaly detection datasets and pre-trained
vision-language models. Our method effectively and efficiently segments any
anomalies with only one normal image prompt, is training-free at test time,
and requires no guidance from language. Our MetaUAS significantly
outperforms previous zero-shot, few-shot,
and even full-shot anomaly segmentation methods. The code and pre-trained
models are available at https://github.com/gaobb/MetaUAS.
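
The abstract's key architectural idea, the soft feature alignment module, can be pictured as soft (attention-based) warping: each query location gathers prompt features from all prompt locations, so the model tolerates geometric variation between the prompt and query images. Below is a minimal PyTorch sketch of that idea; the class name, the single-head attention form, and the concatenation-based fusion are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SoftFeatureAlignment(nn.Module):
    """Softly warp prompt features onto the query's spatial grid.

    Every query location attends over all prompt locations, so the module
    tolerates geometric misalignment between the two images (hypothetical
    re-implementation of the idea described in the abstract).
    """

    def __init__(self, channels: int):
        super().__init__()
        self.scale = channels ** -0.5

    def forward(self, query_feat: torch.Tensor, prompt_feat: torch.Tensor) -> torch.Tensor:
        # query_feat, prompt_feat: (B, C, H, W) from a shared frozen encoder.
        b, c, h, w = query_feat.shape
        q = query_feat.flatten(2).transpose(1, 2)   # (B, HW, C) query tokens
        p = prompt_feat.flatten(2).transpose(1, 2)  # (B, HW, C) prompt tokens
        attn = torch.softmax(q @ p.transpose(1, 2) * self.scale, dim=-1)
        aligned = (attn @ p).transpose(1, 2).reshape(b, c, h, w)
        # Fuse aligned prompt features with query features so a decoder can
        # segment what changed between the two images.
        return torch.cat([query_feat, aligned], dim=1)
```

A hard alignment (e.g., a homography warp) would let small geometric shifts between prompt and query leak into the change map; soft attention instead matches semantically similar regions, which is why the abstract frames it as bridging paired-image change perception and single-image semantic segmentation.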
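Building on the alignment sketch above, the one-prompt inference flow reads as change segmentation: encode the normal prompt and the query with a shared frozen vision encoder, softly align prompt features to the query grid, and decode a change map that serves as the anomaly map. The ResNet-50 backbone, feature level, and decoder below are hypothetical stand-ins for the paper's actual modules; the official code at https://github.com/gaobb/MetaUAS is the authoritative reference.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as tvm

class OnePromptAnomalySegmenter(nn.Module):
    """Sketch of one-prompt change segmentation: a shared frozen encoder,
    soft feature alignment (from the sketch above), and a light decoder that
    predicts what differs between the query and the normal prompt."""

    def __init__(self, channels: int = 1024):
        super().__init__()
        backbone = tvm.resnet50(weights=tvm.ResNet50_Weights.DEFAULT)
        # Keep layers up to layer3 (stride 16, 1024 channels) and freeze them.
        self.encoder = nn.Sequential(*list(backbone.children())[:-3])
        for p in self.encoder.parameters():
            p.requires_grad_(False)
        self.align = SoftFeatureAlignment(channels)  # defined in the sketch above
        self.decoder = nn.Sequential(
            nn.Conv2d(2 * channels, 256, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(256, 1, 1),
        )

    def forward(self, query_img: torch.Tensor, prompt_img: torch.Tensor) -> torch.Tensor:
        f_query = self.encoder(query_img)    # (B, 1024, H/16, W/16)
        f_prompt = self.encoder(prompt_img)
        logits = self.decoder(self.align(f_query, f_prompt))
        # Upsample to input resolution; sigmoid yields a per-pixel anomaly score.
        logits = F.interpolate(logits, size=query_img.shape[-2:],
                               mode="bilinear", align_corners=False)
        return torch.sigmoid(logits)

# Usage: a single normal reference image is enough to inspect any query.
model = OnePromptAnomalySegmenter().eval()
with torch.no_grad():
    prompt = torch.randn(1, 3, 256, 256)   # one normal image prompt
    query = torch.randn(1, 3, 256, 256)    # image to inspect
    anomaly_map = model(query, prompt)     # (1, 1, 256, 256), scores in [0, 1]
```

In this reading, "training-free" means the trained model generalizes to unseen objects and anomalies without any fine-tuning on the target dataset: only the single normal prompt image changes at deployment time.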