Generalized Out-of-Distribution Detection and Beyond in Vision Language Model Era: A Survey

July 31, 2024
Authors: Atsuyuki Miyai, Jingkang Yang, Jingyang Zhang, Yifei Ming, Yueqian Lin, Qing Yu, Go Irie, Shafiq Joty, Yixuan Li, Hai Li, Ziwei Liu, Toshihiko Yamasaki, Kiyoharu Aizawa
cs.AI

Abstract

Detecting out-of-distribution (OOD) samples is crucial for ensuring the safety of machine learning systems and has shaped the field of OOD detection. Several other problems are closely related to OOD detection, including anomaly detection (AD), novelty detection (ND), open set recognition (OSR), and outlier detection (OD). To unify these problems, a generalized OOD detection framework was proposed that organizes all five into a single taxonomy. However, Vision Language Models (VLMs) such as CLIP have significantly changed the paradigm and blurred the boundaries between these fields, once again leaving researchers confused. In this survey, we first present generalized OOD detection v2, a framework that captures how AD, ND, OSR, OOD detection, and OD have evolved in the VLM era. Our framework reveals that, as some fields have become inactive and others have merged, the remaining demanding challenges are OOD detection and AD. We also highlight significant shifts in definitions, problem settings, and benchmarks, and therefore provide a comprehensive review of OOD detection methodology, including a discussion of other related tasks to clarify their relationship to OOD detection. Finally, we explore advances in the emerging Large Vision Language Model (LVLM) era, exemplified by models such as GPT-4V. We conclude with open challenges and future directions.
