AUTOHALLUSION: Automatic Generation of Hallucination Benchmarks for Vision-Language Models

Abstract

Large vision-language models (LVLMs) hallucinate: certain context cues in an image can trigger overconfident and incorrect reasoning by the language module about abnormal or hypothetical objects. Although a few benchmarks have been developed to investigate LVLM hallucinations, they rely mainly on hand-crafted corner cases whose failure patterns may not generalize, and fine-tuning on them could undermine their validity. These limitations motivate us to develop AUTOHALLUSION, the first automatic benchmark generation approach, which uses a few principal strategies to create diverse hallucination examples. It probes the language modules of LVLMs for context cues and uses them to synthesize images by: (1) adding objects that are abnormal for the context cues; (2) for two co-occurring objects, keeping one and excluding the other; or (3) removing objects closely tied to the context cues. It then generates image-based questions whose ground-truth answers contradict the language module's prior. A model has to overcome contextual biases and distractions to reach the correct answers, while incorrect or inconsistent answers indicate hallucinations. AUTOHALLUSION lets us create new benchmarks at minimal cost and thus overcomes the fragility of hand-crafted benchmarks. It also reveals common failure patterns and their causes, providing key insights for detecting, avoiding, or controlling hallucinations. Comprehensive evaluations of top-tier LVLMs, e.g., GPT-4V(ision), Gemini Pro Vision, Claude 3, and LLaVA-1.5, show hallucination induction success rates of 97.7% and 98.7% on the synthetic and real-world datasets of AUTOHALLUSION, respectively, paving the way for a long battle against hallucinations.
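
As a rough illustration of the three strategies and the question construction described in the abstract, the following Python sketch operates on plain object lists rather than on images. Every name here (`Probe`, `add_abnormal`, `keep_one_of_pair`, `remove_correlated`, `is_hallucination`) and the yes/no existence-question format are assumptions made for illustration, not the authors' implementation; the actual approach synthesizes images and obtains the context cues by probing an LVLM's language module.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    """One hypothetical benchmark case (illustrative structure only)."""
    strategy: str    # label for the strategy that produced this case
    objects: tuple   # objects the synthesized image would contain
    question: str    # existence question posed to the model
    answer: str      # ground truth, chosen to contradict the language prior


def existence_question(obj):
    return f"Is there any {obj} in this image?"


def add_abnormal(scene_objects, abnormal_obj):
    """Strategy (1): add an object that is abnormal for the context."""
    objects = tuple(scene_objects) + (abnormal_obj,)
    return Probe("add_abnormal", objects, existence_question(abnormal_obj), "yes")


def keep_one_of_pair(scene_objects, kept_obj, excluded_obj):
    """Strategy (2): of two co-occurring objects, keep one, exclude the other."""
    objects = tuple(o for o in scene_objects if o != excluded_obj)
    if kept_obj not in objects:
        objects += (kept_obj,)
    return Probe("keep_one_of_pair", objects, existence_question(excluded_obj), "no")


def remove_correlated(scene_objects, correlated_obj):
    """Strategy (3): remove an object closely tied to the context."""
    objects = tuple(o for o in scene_objects if o != correlated_obj)
    return Probe("remove_correlated", objects, existence_question(correlated_obj), "no")


def is_hallucination(probe, model_answer):
    """Treat an answer that disagrees with the ground truth as a hallucination."""
    return model_answer.strip().lower() != probe.answer


# Usage with a made-up office scene (hypothetical data).
office = ("desk", "monitor", "keyboard", "mouse")
probes = [
    add_abnormal(office, "octopus"),
    keep_one_of_pair(office, "keyboard", "mouse"),
    remove_correlated(office, "monitor"),
]
for probe in probes:
    print(probe.strategy, "|", probe.question, "| ground truth:", probe.answer)
```

The sketch checks only incorrect answers. Detecting inconsistent answers, which the abstract also counts as hallucinations, would require asking several related questions about the same image and comparing the responses.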
