

AnyAnomaly: Zero-Shot Customizable Video Anomaly Detection with LVLM

March 6, 2025
Authors: Sunghyun Ahn, Youngwan Jo, Kijung Lee, Sein Kwon, Inpyo Hong, Sanghyun Park
cs.AI

Abstract

Video anomaly detection (VAD) is crucial for video analysis and surveillance in computer vision. However, existing VAD models rely on learned normal patterns, which makes them difficult to apply to diverse environments. Consequently, users must retrain models or develop separate AI models for new environments, which requires machine-learning expertise, high-performance hardware, and extensive data collection, limiting the practical usability of VAD. To address these challenges, this study proposes the customizable video anomaly detection (C-VAD) technique and the AnyAnomaly model. C-VAD treats user-defined text as an abnormal event and detects frames containing the specified event in a video. We effectively implemented AnyAnomaly using context-aware visual question answering without fine-tuning the large vision-language model. To validate the effectiveness of the proposed model, we constructed C-VAD datasets and demonstrated the superiority of AnyAnomaly. Furthermore, our approach showed competitive performance on VAD benchmark datasets, achieving state-of-the-art results on the UBnormal dataset and outperforming other methods in generalization across all datasets. Our code is available online at github.com/SkiddieAhn/Paper-AnyAnomaly.
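To make the core idea concrete, below is a minimal, hypothetical sketch of customizable, zero-shot frame-level anomaly scoring: the user-defined text is posed as a yes/no visual question to an off-the-shelf VQA model, and the affirmative probability mass is used as the frame's anomaly score. This is not the authors' AnyAnomaly pipeline (which uses context-aware VQA with an LVLM); the model name, prompt template, and scoring rule here are illustrative assumptions.

```python
# Sketch: user-defined text -> yes/no visual question -> per-frame anomaly score.
# Stand-in model: a generic Hugging Face VQA pipeline, not the LVLM from the paper.
from transformers import pipeline
from PIL import Image

vqa = pipeline("visual-question-answering", model="dandelin/vilt-b32-finetuned-vqa")

def frame_anomaly_score(frame_path: str, user_event: str) -> float:
    """Return a score in [0, 1] for how likely the user-defined event appears in the frame."""
    image = Image.open(frame_path).convert("RGB")
    question = f"Is there {user_event} in this image?"  # simple prompt template (assumption)
    answers = vqa(image=image, question=question, top_k=5)
    # Sum the probability mass assigned to affirmative answers.
    return sum(a["score"] for a in answers if a["answer"].lower() in {"yes", "yeah"})

if __name__ == "__main__":
    # Example: treat "a person riding a bicycle" as the user-specified anomalous event.
    score = frame_anomaly_score("frame_0001.jpg", "a person riding a bicycle")
    print(f"anomaly score: {score:.3f}")  # flag the frame if the score exceeds a chosen threshold
```

In this sketch, per-frame scores can be thresholded or smoothed over time to produce video-level anomaly curves; the paper's context-aware design additionally conditions the question answering on scene context rather than scoring isolated frames.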
