Attention IoU: Examining Biases in CelebA using Attention Maps
March 25, 2025
Authors: Aaron Serianni, Tyler Zhu, Olga Russakovsky, Vikram V. Ramaswamy
cs.AI
Abstract
Computer vision models have been shown to exhibit and amplify biases across a
wide array of datasets and tasks. Existing methods for quantifying bias in
classification models primarily focus on dataset distribution and model
performance on subgroups, overlooking the internal workings of a model. We
introduce the Attention-IoU (Attention Intersection over Union) metric and
related scores, which use attention maps to reveal biases within a model's
internal representations and identify image features potentially causing the
biases. First, we validate Attention-IoU on the synthetic Waterbirds dataset,
showing that the metric accurately measures model bias. We then analyze the
CelebA dataset, finding that Attention-IoU uncovers correlations beyond
accuracy disparities. Through an investigation of individual attributes guided by
the protected attribute of Male, we examine the distinct ways biases are
represented in CelebA. Lastly, by subsampling the training set to change
attribute correlations, we demonstrate that Attention-IoU reveals potential
confounding variables not present in dataset labels.
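The abstract does not spell out the metric's exact formulation. As a rough illustration of the underlying idea, the sketch below computes a soft intersection-over-union between two attention heatmaps; the min-max normalization and the `attention_iou` helper are assumptions made for illustration, not the paper's definition.

```python
import numpy as np


def attention_iou(map_a: np.ndarray, map_b: np.ndarray, eps: float = 1e-8) -> float:
    """Soft IoU between two attention heatmaps of the same spatial size.

    Sketch only: each map is min-max normalized to [0, 1]; the "intersection"
    is the element-wise minimum and the "union" the element-wise maximum,
    mirroring standard IoU but for continuous-valued maps. The actual
    Attention-IoU scores in the paper may be defined differently.
    """
    a = (map_a - map_a.min()) / (map_a.max() - map_a.min() + eps)
    b = (map_b - map_b.min()) / (map_b.max() - map_b.min() + eps)
    intersection = np.minimum(a, b).sum()
    union = np.maximum(a, b).sum()
    return float(intersection / (union + eps))


# Example: compare the attention map for a target attribute against the map
# for the protected attribute "Male" on the same image (random 7x7 maps are
# used here as stand-ins for real model attention).
rng = np.random.default_rng(0)
attn_target = rng.random((7, 7))
attn_male = rng.random((7, 7))
print(f"Attention-IoU (sketch): {attention_iou(attn_target, attn_male):.3f}")
```

A high overlap between a target attribute's attention and the Male attribute's attention would, under this sketch, indicate that the model attends to similar image regions for both, which is the kind of internal-representation bias the metric is designed to surface.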