Online Self-Calibration Against Hallucination in Vision-Language Models
May 1, 2026
Authors: Minghui Chen, Chenxu Yang, Hengjie Zhu, Dayan Wu, Zheng Lin, Qingyi Si
cs.AI
Abstract
Large Vision-Language Models (LVLMs) often suffer from hallucinations, generating descriptions that include visual details absent from the input image. Recent preference alignment methods typically rely on supervision distilled from stronger models such as GPT. However, this offline paradigm introduces a Supervision-Perception Mismatch: the student model is forced to align with fine-grained details beyond its perceptual capacity, learning to guess rather than to see. To obtain reliable self-supervision for online learning, we identify a Generative-Discriminative Gap within LVLMs: models exhibit markedly higher accuracy on discriminative verification tasks than on open-ended generation. Leveraging this capability, we propose Online Self-CAlibRation (OSCAR), a framework that integrates Monte Carlo Tree Search with a Dual-Granularity Reward Mechanism to construct preference data, and iteratively refines the model via Direct Preference Optimization. Extensive experiments demonstrate that OSCAR achieves state-of-the-art performance on hallucination benchmarks while also improving general multimodal capabilities.