Attention IoU: Examining Biases in CelebA using Attention Maps
March 25, 2025
Authors: Aaron Serianni, Tyler Zhu, Olga Russakovsky, Vikram V. Ramaswamy
cs.AI
Abstract
Computer vision models have been shown to exhibit and amplify biases across a
wide array of datasets and tasks. Existing methods for quantifying bias in
classification models primarily focus on dataset distribution and model
performance on subgroups, overlooking the internal workings of a model. We
introduce the Attention-IoU (Attention Intersection over Union) metric and
related scores, which use attention maps to reveal biases within a model's
internal representations and identify image features potentially causing the
biases. First, we validate Attention-IoU on the synthetic Waterbirds dataset,
showing that the metric accurately measures model bias. We then analyze the
CelebA dataset, finding that Attention-IoU uncovers correlations beyond
accuracy disparities. By investigating individual attributes in relation to
the protected attribute of Male, we examine the distinct ways biases are
represented in CelebA. Lastly, by subsampling the training set to change
attribute correlations, we demonstrate that Attention-IoU reveals potential
confounding variables not present in dataset labels.
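To make the core idea concrete, the following is a minimal, hypothetical sketch of an IoU-style score between two attention maps. The paper's exact Attention-IoU definition is not given in this abstract; this sketch assumes a common soft-IoU formulation in which each non-negative map is normalized to sum to one and the elementwise minimum and maximum serve as soft intersection and union.

```python
import numpy as np

def attention_iou(map_a, map_b, eps=1e-8):
    """Soft IoU between two non-negative attention maps.

    Hypothetical formulation (not the paper's exact definition):
    normalize each map to a probability distribution, then take
    the elementwise min as the soft intersection and the
    elementwise max as the soft union.
    """
    a = map_a / (map_a.sum() + eps)
    b = map_b / (map_b.sum() + eps)
    intersection = np.minimum(a, b).sum()
    union = np.maximum(a, b).sum()
    return intersection / (union + eps)

# Identical maps score ~1; maps attending to disjoint regions score ~0.
a = np.zeros((4, 4)); a[:2, :2] = 1.0   # attention on top-left quadrant
b = np.zeros((4, 4)); b[2:, 2:] = 1.0   # attention on bottom-right quadrant
print(attention_iou(a, a))
print(attention_iou(a, b))
```

Under this formulation, comparing a model's attention map for a target attribute against the map for a protected attribute (or against a feature mask) yields a score in [0, 1], where high overlap can flag features the model may be relying on when exhibiting bias.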