Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling

Abstract

Vision-Language Models (VLMs) excel at visual understanding but often suffer from visual hallucinations, in which they generate descriptions of nonexistent objects, actions, or concepts. This poses significant risks in safety-critical applications. Existing hallucination mitigation methods typically follow one of two paradigms: generation adjustment, which modifies decoding behavior to align text with visual inputs, and post-hoc verification, in which external models assess and correct outputs. Generation adjustment methods, though effective, often rely on heuristics and lack correction mechanisms, whereas post-hoc verification is complicated, typically requires multiple models, and tends to reject outputs rather than refine them. In this work, we introduce REVERSE, a unified framework that integrates hallucination-aware training with on-the-fly self-verification. By leveraging a new hallucination-verification dataset containing over 1.3M semi-synthetic samples, along with a novel inference-time retrospective resampling technique, our approach enables VLMs to both detect hallucinations during generation and dynamically revise them. Our evaluations show that REVERSE achieves state-of-the-art hallucination reduction, outperforming the best existing methods by up to 12% on CHAIR-MSCOCO and 28% on HaloQuest. Our dataset, model, and code are available at: https://reverse-vlm.github.io.
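The abstract does not spell out how retrospective resampling works, so the following is only a toy sketch of the general idea of detecting a suspect token during generation, backtracking, and sampling again. Everything in it is a hypothetical stand-in rather than the REVERSE implementation: `propose` replaces a real VLM decoding step, `hallucination_score` replaces the model's self-verification with a lookup against a known object set, and `threshold` and `max_retries` are illustrative parameters.

```python
import random
from typing import Callable, List

# Hypothetical toy setup: objects actually present in the "image", and a
# vocabulary that also contains objects that are NOT present.
IMAGE_OBJECTS = {"dog", "ball", "grass"}
VOCABULARY = ["dog", "ball", "grass", "cat", "frisbee"]


def propose(prefix: List[str], rng: random.Random) -> str:
    """Stand-in for one VLM decoding step: sample an object name at random."""
    return rng.choice(VOCABULARY)


def hallucination_score(prefix: List[str]) -> float:
    """Stand-in for self-verification: 1.0 if the newest token is not in the image."""
    return 0.0 if prefix[-1] in IMAGE_OBJECTS else 1.0


def generate_with_resampling(
    propose_fn: Callable[[List[str], random.Random], str],
    score_fn: Callable[[List[str]], float],
    max_len: int,
    threshold: float = 0.5,
    max_retries: int = 10,
    seed: int = 0,
) -> List[str]:
    """Generate tokens, backtracking and resampling whenever one is flagged."""
    rng = random.Random(seed)
    tokens: List[str] = []
    retries = 0
    while len(tokens) < max_len:
        tokens.append(propose_fn(tokens, rng))
        if score_fn(tokens) <= threshold:
            retries = 0  # token accepted; reset the retry budget
            continue
        tokens.pop()  # token flagged: backtrack by removing it
        retries += 1
        if retries > max_retries:
            break  # stop early rather than emit a flagged token
    return tokens


caption = generate_with_resampling(propose, hallucination_score, max_len=3)
print(caption)
```

In this toy version a flagged token is never kept: it is either replaced by a later sample or generation stops once the retry budget is exhausted. A real system would need a learned verification signal in place of the ground-truth lookup used here.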
