LANCE: Stress-testing Visual Models by Generating Language-guided Counterfactual Images

Abstract

We propose an automated algorithm to stress-test a trained visual model by generating language-guided counterfactual test images (LANCE). Our method leverages recent progress in large language modeling and text-based image editing to augment an IID test set with a suite of diverse, realistic, and challenging test images without altering model weights. We benchmark the performance of a diverse set of pretrained models on our generated data and observe significant and consistent performance drops. We further analyze model sensitivity across different types of edits and demonstrate the method's applicability to surfacing previously unknown class-level model biases in ImageNet.

