
MetaUAS: Universal Anomaly Segmentation with One-Prompt Meta-Learning

May 14, 2025
Author: Bin-Bin Gao
cs.AI

Abstract

Zero- and few-shot visual anomaly segmentation relies on powerful vision-language models that detect unseen anomalies using manually designed textual prompts. However, visual representations are inherently independent of language. In this paper, we explore the potential of a pure visual foundation model as an alternative to the widely used vision-language models for universal visual anomaly segmentation. We present a novel paradigm that unifies anomaly segmentation into change segmentation. This paradigm enables us to leverage large-scale synthetic image pairs, featuring object-level and local-region changes, derived from existing image datasets that are independent of target anomaly datasets. We propose a one-prompt meta-learning framework for Universal Anomaly Segmentation (MetaUAS) that is trained on this synthetic dataset and then generalizes well to segment any novel or unseen visual anomaly in the real world. To handle geometric variations between prompt and query images, we propose a soft feature alignment module that bridges paired-image change perception and single-image semantic segmentation. This is the first work to achieve universal anomaly segmentation with a pure vision model, without relying on special anomaly detection datasets or pre-trained vision-language models. Our method effectively and efficiently segments any anomaly with only one normal image prompt, and it is training-free at deployment, requiring no guidance from language. MetaUAS significantly outperforms previous zero-shot, few-shot, and even full-shot anomaly segmentation methods. The code and pre-trained models are available at https://github.com/gaobb/MetaUAS.
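The soft feature alignment idea described above can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a generic attention-style alignment in which, for each query location, prompt features are aggregated with softmax weights over cosine similarities, softly warping the prompt onto the query's geometry before change perception. The function name `soft_feature_align` and all shapes are hypothetical.

```python
import numpy as np

def soft_feature_align(query_feat, prompt_feat):
    """Hypothetical soft alignment: warp prompt features onto the
    query's spatial layout via softmax-weighted cosine similarity.
    Both inputs are (C, H, W) feature maps."""
    C, H, W = query_feat.shape
    q = query_feat.reshape(C, -1)   # (C, HW_q)
    p = prompt_feat.reshape(C, -1)  # (C, HW_p)
    # L2-normalize channel vectors so the dot product is cosine similarity.
    qn = q / (np.linalg.norm(q, axis=0, keepdims=True) + 1e-8)
    pn = p / (np.linalg.norm(p, axis=0, keepdims=True) + 1e-8)
    sim = qn.T @ pn                 # (HW_q, HW_p) similarity matrix
    # Softmax over prompt locations -> soft correspondence weights.
    w = np.exp(sim - sim.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    # Each query location receives a weighted mix of prompt features.
    aligned = (p @ w.T).reshape(C, H, W)
    return aligned

# A change-aware representation could then compare the query features
# with the aligned prompt features, e.g. by difference or concatenation.
rng = np.random.default_rng(0)
query = rng.standard_normal((8, 4, 4)).astype(np.float32)
prompt = rng.standard_normal((8, 4, 4)).astype(np.float32)
aligned = soft_feature_align(query, prompt)
change = query - aligned  # crude proxy for a change map
```

Because the alignment is soft (a weighted average rather than a hard nearest-neighbor match), it tolerates the geometric variations between prompt and query images that the abstract highlights.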

