Attention IoU: Examining Biases in CelebA using Attention Maps
March 25, 2025
Authors: Aaron Serianni, Tyler Zhu, Olga Russakovsky, Vikram V. Ramaswamy
cs.AI
Abstract
Computer vision models have been shown to exhibit and amplify biases across a
wide array of datasets and tasks. Existing methods for quantifying bias in
classification models primarily focus on dataset distribution and model
performance on subgroups, overlooking the internal workings of a model. We
introduce the Attention-IoU (Attention Intersection over Union) metric and
related scores, which use attention maps to reveal biases within a model's
internal representations and identify image features potentially causing the
biases. First, we validate Attention-IoU on the synthetic Waterbirds dataset,
showing that the metric accurately measures model bias. We then analyze the
CelebA dataset, finding that Attention-IoU uncovers correlations beyond
accuracy disparities. By investigating individual attributes in relation to
the protected attribute of Male, we examine the distinct ways biases are
represented in CelebA. Lastly, by subsampling the training set to change
attribute correlations, we demonstrate that Attention-IoU reveals potential
confounding variables not present in dataset labels.
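To make the core idea concrete, the following is a minimal, hypothetical sketch of an IoU-style score between two attention maps. The paper's exact Attention-IoU definition is not given in this abstract; this sketch assumes a common soft-IoU formulation in which each non-negative map is normalized to sum to one and the elementwise minimum and maximum serve as soft intersection and union.

```python
import numpy as np

def attention_iou(map_a, map_b, eps=1e-8):
    """Soft IoU between two non-negative attention maps.

    Hypothetical formulation (not the paper's exact definition):
    normalize each map to a probability distribution, then take
    the elementwise min as the soft intersection and the
    elementwise max as the soft union.
    """
    a = map_a / (map_a.sum() + eps)
    b = map_b / (map_b.sum() + eps)
    intersection = np.minimum(a, b).sum()
    union = np.maximum(a, b).sum()
    return intersection / (union + eps)

# Identical maps score ~1; maps attending to disjoint regions score ~0.
a = np.zeros((4, 4)); a[:2, :2] = 1.0   # attention on top-left quadrant
b = np.zeros((4, 4)); b[2:, 2:] = 1.0   # attention on bottom-right quadrant
print(attention_iou(a, a))
print(attention_iou(a, b))
```

Under this formulation, comparing a model's attention map for a target attribute against the map for a protected attribute (or against a feature mask) yields a score in [0, 1], where high overlap can flag features the model may be relying on when exhibiting bias.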