Attention IoU: Examining Biases in CelebA using Attention Maps

Abstract

Computer vision models have been shown to exhibit and amplify biases across a wide array of datasets and tasks. Existing methods for quantifying bias in classification models focus primarily on dataset distribution and on model performance across subgroups, overlooking the internal workings of the model. We introduce the Attention-IoU (Attention Intersection over Union) metric and related scores, which use attention maps to reveal biases within a model's internal representations and to identify image features that potentially cause those biases. First, we validate Attention-IoU on the synthetic Waterbirds dataset, showing that the metric accurately measures model bias. We then analyze the CelebA dataset and find that Attention-IoU uncovers correlations beyond accuracy disparities. By investigating individual attributes with respect to the protected attribute Male, we examine the distinct ways in which biases are represented in CelebA. Lastly, by subsampling the training set to change attribute correlations, we demonstrate that Attention-IoU reveals potential confounding variables that are not present in the dataset labels.
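To make the idea of an intersection over union between attention maps concrete, the following is a minimal sketch of one common way to generalize IoU from binary masks to non-negative heatmaps: the sum of element-wise minima divided by the sum of element-wise maxima. This is an illustrative assumption, not necessarily the definition used for Attention-IoU; the function name `soft_iou` and the example arrays are hypothetical.

```python
import numpy as np


def soft_iou(map_a, map_b):
    """Soft IoU between two non-negative heatmaps of equal shape.

    Assumption: intersection = sum of element-wise minima,
    union = sum of element-wise maxima. For binary masks this
    reduces to the ordinary IoU.
    """
    a = np.asarray(map_a, dtype=float)
    b = np.asarray(map_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("heatmaps must have the same shape")
    if (a < 0).any() or (b < 0).any():
        raise ValueError("heatmaps must be non-negative")

    intersection = np.minimum(a, b).sum()
    union = np.maximum(a, b).sum()
    # Two all-zero maps have no attention to compare; report 0 overlap.
    if union == 0:
        return 0.0
    return float(intersection / union)


# Hypothetical 2x2 attention maps for two attributes of the same image.
attention_target = np.array([[1.0, 0.0], [0.0, 0.0]])
attention_protected = np.array([[0.5, 0.5], [0.0, 0.0]])
score = soft_iou(attention_target, attention_protected)
```

Under this formulation, a score near 1 means the two maps attend to the same image regions, and a score near 0 means they attend to disjoint regions.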
