

Mitigating Object Hallucinations via Sentence-Level Early Intervention

July 16, 2025
Authors: Shangpin Peng, Senqiao Yang, Li Jiang, Zhuotao Tian
cs.AI

Abstract

Multimodal large language models (MLLMs) have revolutionized cross-modal understanding but continue to struggle with hallucinations: fabricated content that contradicts the visual input. Existing hallucination mitigation methods either incur prohibitive computational costs or introduce distribution mismatches between training data and model outputs. We identify a critical insight: hallucinations predominantly emerge at the early stages of text generation and propagate through subsequent outputs. To address this, we propose **SENTINEL** (**S**entence-level **E**arly i**N**tervention **T**hrough **IN**-domain pr**E**ference **L**earning), a framework that eliminates dependency on human annotations. Specifically, we first bootstrap high-quality in-domain preference pairs by iteratively sampling model outputs, validating object existence through cross-checking with two open-vocabulary detectors, and classifying sentences as hallucinated or non-hallucinated. Subsequently, we iteratively build context-aware preference data from context-coherent positive samples and hallucinated negative samples. Finally, we train models with a context-aware preference loss (C-DPO) that emphasizes discriminative learning at the sentence level, where hallucinations first manifest. Experimental results show that SENTINEL reduces hallucinations by over 90% compared to the original model and outperforms the previous state-of-the-art method on both hallucination and general-capability benchmarks, demonstrating its superiority and generalization ability. The models, datasets, and code are available at https://github.com/pspdada/SENTINEL.
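To make the two core ideas in the abstract concrete, the sketch below illustrates (a) labeling a sentence as hallucinated only when a mentioned object is missing according to *both* open-vocabulary detectors, and (b) a DPO-style preference loss applied to sentence pairs that share the same generation context. This is a minimal, hedged illustration, not the authors' implementation: the detector API (`detector.score`), the Hugging Face-style `model(ids).logits` interface, and all hyperparameters are assumptions; consult the SENTINEL repository for the actual code.

```python
# Illustrative sketch only: detector APIs, helper names, and hyperparameters
# are assumptions, not the SENTINEL implementation.
import torch
import torch.nn.functional as F


def is_hallucinated(sentence_objects, image, detector_a, detector_b, thr=0.5):
    """Mark a sentence as hallucinated if any object it mentions is absent
    according to BOTH open-vocabulary detectors (cross-validation)."""
    for obj in sentence_objects:
        score_a = detector_a.score(image, obj)  # assumed API: confidence in [0, 1]
        score_b = detector_b.score(image, obj)
        if score_a < thr and score_b < thr:     # neither detector finds the object
            return True
    return False


def c_dpo_loss(policy, reference, context_ids, pos_ids, neg_ids, beta=0.1):
    """DPO-style sentence-level preference loss: the preferred (non-hallucinated)
    and rejected (hallucinated) sentences continue the same shared context."""
    def seq_logprob(model, ctx, cont):
        ids = torch.cat([ctx, cont], dim=-1)
        # Logits at positions ctx_len-1 .. end-2 predict the continuation tokens.
        logits = model(ids).logits[:, ctx.size(-1) - 1:-1, :]
        logp = F.log_softmax(logits, dim=-1)
        return logp.gather(-1, cont.unsqueeze(-1)).squeeze(-1).sum(-1)

    pi_pos = seq_logprob(policy, context_ids, pos_ids)
    pi_neg = seq_logprob(policy, context_ids, neg_ids)
    with torch.no_grad():  # frozen reference model
        ref_pos = seq_logprob(reference, context_ids, pos_ids)
        ref_neg = seq_logprob(reference, context_ids, neg_ids)

    margin = beta * ((pi_pos - ref_pos) - (pi_neg - ref_neg))
    return -F.logsigmoid(margin).mean()
```

Requiring both detectors to miss an object before labeling a sentence as hallucinated is one plausible way to keep false-positive hallucination labels low when bootstrapping preference data without human annotation.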