Probabilistic Conceptual Explainers: Trustworthy Conceptual Explanations for Vision Foundation Models
June 18, 2024
Authors: Hengyi Wang, Shiwei Tan, Hao Wang
cs.AI
Abstract
Vision transformers (ViTs) have emerged as a significant area of focus,
particularly for their capacity to be jointly trained with large language
models and to serve as robust vision foundation models. Yet, the development of
trustworthy explanation methods for ViTs has lagged, particularly in the
context of post-hoc interpretations of ViT predictions. Existing sub-image
selection approaches, such as feature-attribution and conceptual models, fall
short in this regard. This paper proposes five desiderata for explaining ViTs
-- faithfulness, stability, sparsity, multi-level structure, and parsimony --
and demonstrates the inadequacy of current methods in meeting these criteria
comprehensively. We introduce a variational Bayesian explanation framework,
dubbed ProbAbilistic Concept Explainers (PACE), which models the distributions
of patch embeddings to provide trustworthy post-hoc conceptual explanations.
Our qualitative analysis reveals the distributions of patch-level concepts,
elucidating the effectiveness of ViTs by modeling the joint distribution of
patch embeddings and ViT predictions. Moreover, these patch-level
explanations bridge the gap between image-level and dataset-level explanations,
thus completing the multi-level structure of PACE. Through extensive
experiments on both synthetic and real-world datasets, we demonstrate that PACE
surpasses state-of-the-art methods in terms of the defined desiderata.
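
The multi-level structure described in the abstract (patch-level, image-level, dataset-level) can be illustrated with a toy sketch. This is not the PACE model: it swaps in a plain Gaussian mixture with fixed, hand-picked concept centres and an isotropic variance, and every name in it (`patch_concept_posteriors`, `means`, `var`, the random toy data) is a hypothetical stand-in. It only shows how soft concept assignments over patch embeddings could be averaged upward into image-level and dataset-level summaries.

```python
import numpy as np

def patch_concept_posteriors(patches, means, var=1.0, prior=None):
    """Soft-assign each patch embedding to one of K Gaussian 'concepts'.

    patches: (num_patches, dim); means: (K, dim).
    Returns an array of shape (num_patches, K) whose rows sum to 1.
    Assumption: isotropic Gaussians with shared variance `var`.
    """
    k = means.shape[0]
    prior = np.full(k, 1.0 / k) if prior is None else np.asarray(prior, dtype=float)
    # Squared Euclidean distance from every patch to every concept centre.
    sq_dist = ((patches[:, None, :] - means[None, :, :]) ** 2).sum(axis=-1)
    log_post = np.log(prior)[None, :] - 0.5 * sq_dist / var
    # Subtract the row maximum before exponentiating for numerical stability.
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
means = np.array([[-3.0, 0.0], [3.0, 0.0]])  # two hypothetical concept centres

# Toy "dataset": 4 images, 9 patches each, 2-dimensional patch embeddings.
true_concept = rng.integers(0, 2, size=(4, 9))
images = rng.normal(size=(4, 9, 2)) + means[true_concept]

# Patch level: one concept distribution per patch.
patch_level = np.stack([patch_concept_posteriors(img, means) for img in images])
# Image level: average the patch-level distributions within each image.
image_level = patch_level.mean(axis=1)
# Dataset level: average the image-level distributions over all images.
dataset_level = image_level.mean(axis=0)

print(patch_level.shape, image_level.shape, dataset_level.shape)
```

In the paper's setting the concept parameters would be inferred variationally and tied to the ViT's predictions rather than fixed by hand; the sketch leaves that part out entirely.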