Vision Language Models are Biased

Abstract

Large language models (LLMs) memorize a vast amount of prior knowledge from the Internet that helps them on downstream tasks but can also notoriously sway their outputs towards wrong or biased answers. In this work, we test how knowledge about popular subjects hurts the accuracy of vision language models (VLMs) on standard, objective visual tasks of counting and identification. We find that state-of-the-art VLMs are strongly biased (e.g., unable to recognize that a fourth stripe has been added to a 3-stripe Adidas logo), scoring an average of 17.05% accuracy in counting (e.g., counting stripes in an Adidas-like logo) across 7 diverse domains, from animals, logos, chess, board games, and optical illusions to patterned grids. Inserting text (e.g., "Adidas") describing the subject name into the counterfactual image further decreases VLM accuracy. The biases in VLMs are so strong that instructing them to double-check their results or to rely exclusively on image details improves counting accuracy by only +2 points on average. Our work presents an interesting failure mode in VLMs and an automated framework for testing VLM biases. Code and data are available at: vlmsarebiased.github.io.
