ChatPaper.ai


Online Self-Calibration Against Hallucination in Vision-Language Models

May 1, 2026
作者: Minghui Chen, Chenxu Yang, Hengjie Zhu, Dayan Wu, Zheng Lin, Qingyi Si
cs.AI

Abstract

Large Vision-Language Models (LVLMs) often suffer from hallucinations, generating descriptions that include visual details absent from the input image. Recent preference alignment methods typically rely on supervision distilled from stronger models such as GPT. However, this offline paradigm introduces a Supervision-Perception Mismatch: the student model is forced to align with fine-grained details beyond its perceptual capacity, learning to guess rather than to see. To obtain reliable self-supervision for online learning, we identify a Generative-Discriminative Gap within LVLMs, where models exhibit higher accuracy on discriminative verification than open-ended generation. Leveraging this capability, we propose Online Self-CAlibRation (OSCAR), a framework that integrates Monte Carlo Tree Search with a Dual-Granularity Reward Mechanism to construct preference data and iteratively refines the model via Direct Preference Optimization. Extensive experiments demonstrate that OSCAR achieves state-of-the-art performance on hallucination benchmarks while improving general multimodal capabilities.
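The final optimization step named in the abstract is Direct Preference Optimization over the self-generated preference pairs. As a minimal sketch of the standard DPO objective (not OSCAR's full pipeline, whose MCTS search and dual-granularity rewards are described in the paper), the per-pair loss can be computed from the summed token log-likelihoods of the preferred and dispreferred responses under the policy and a frozen reference model; the function name and `beta` default here are illustrative assumptions:

```python
import math

def dpo_loss(logp_chosen: float, logp_rejected: float,
             ref_logp_chosen: float, ref_logp_rejected: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for a single preference pair (illustrative sketch).

    Each argument is the summed token log-probability of the chosen or
    rejected response under the current policy or the frozen reference
    model. The loss is -log sigmoid(beta * margin), where the margin is
    the difference of policy-vs-reference log-ratios.
    """
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    # -log sigmoid(beta * margin): small when the policy prefers the
    # chosen response more strongly than the reference model does.
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))
```

When the policy and reference assign identical log-ratios to both responses, the margin is zero and the loss equals log 2; increasing the policy's relative preference for the chosen response drives the loss toward zero.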