Tell me why: Visual foundation models as self-explainable classifiers
February 26, 2025
Authors: Hugues Turbé, Mina Bjelogrlic, Gianmarco Mengaldo, Christian Lovis
cs.AI
Abstract
Visual foundation models (VFMs) have become increasingly popular due to their
state-of-the-art performance. However, interpretability remains crucial for
critical applications. To this end, self-explainable models (SEMs) aim to
provide interpretable classifiers that decompose predictions into a weighted
sum of interpretable concepts. Despite their promise, recent studies have shown
that these explanations often lack faithfulness. In this work, we combine VFMs
with a novel prototypical architecture and specialized training objectives. By
training only a lightweight head (approximately 1M parameters) on top of frozen
VFMs, our approach (ProtoFM) offers an efficient and interpretable solution.
Evaluations demonstrate that our approach achieves competitive classification
performance while outperforming existing models across a range of
interpretability metrics derived from the literature. Code is available at
https://github.com/hturbe/proto-fm.
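The abstract describes predictions decomposed into a weighted sum of interpretable concepts, learned by a lightweight head on top of a frozen VFM. Below is a minimal sketch of what such a prototypical head could look like in PyTorch; the module name, dimensions, and the cosine-similarity / max-pooling choices are illustrative assumptions, not the authors' ProtoFM implementation or its training objectives (see the linked repository for those).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PrototypeHead(nn.Module):
    """Illustrative lightweight prototypical head for a frozen vision backbone:
    patch features are compared to learned prototypes via cosine similarity,
    similarities are max-pooled over patches, and a linear layer maps the
    resulting concept activations to class logits, so each prediction is a
    weighted sum of prototype (concept) activations."""

    def __init__(self, feat_dim: int, num_prototypes: int, num_classes: int):
        super().__init__()
        # Learned prototype vectors, one candidate "concept" per prototype.
        self.prototypes = nn.Parameter(torch.randn(num_prototypes, feat_dim))
        # Class weights over prototype activations (the weighted sum).
        self.classifier = nn.Linear(num_prototypes, num_classes, bias=False)

    def forward(self, patch_feats: torch.Tensor):
        # patch_feats: [batch, num_patches, feat_dim] from the frozen backbone.
        sims = F.normalize(patch_feats, dim=-1) @ F.normalize(self.prototypes, dim=-1).T
        # Max over patches: how strongly each prototype fires anywhere in the image.
        concept_acts, _ = sims.max(dim=1)        # [batch, num_prototypes]
        logits = self.classifier(concept_acts)   # [batch, num_classes]
        return logits, concept_acts


# Dummy patch features standing in for frozen-VFM output (e.g. ViT patch tokens).
feats = torch.randn(2, 196, 384)
head = PrototypeHead(feat_dim=384, num_prototypes=200, num_classes=10)
logits, concepts = head(feats)  # logits: [2, 10], concepts: [2, 200]
```

In a setup like this, only the head's parameters would be trained, consistent with the abstract's claim that roughly 1M trainable parameters suffice on top of the frozen VFM.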