Mitigating Multimodal Hallucination via Phase-wise Self-reward

Abstract

Large Vision-Language Models (LVLMs) still struggle with visual hallucination, where generated responses are inconsistent with the visual input. Existing methods either rely on large-scale annotated data for fine-tuning, which incurs massive computational overhead, or employ static post-hoc strategies that overlook the dynamic nature of hallucination emergence. To address these limitations, we introduce a new self-rewarding framework that enables dynamic hallucination mitigation at inference time without external supervision. On the empirical side, we reveal that visual hallucination exhibits phase-wise dynamic patterns, peaking at the onset of each semantic phase. Drawing on these insights, we propose PSRD (Phase-wise Self-Reward Decoding) for online hallucination correction guided by phase-wise self-reward signals. To reduce the cost of repeated self-evaluation during decoding, we distill the hallucination guidance signal from LVLMs into a lightweight reward model. The reward model then provides on-the-fly guidance for targeted intervention during decoding, enabling precise hallucination suppression. The proposed PSRD reduces the hallucination rate of LLaVA-1.5-7B by 50.0% and consistently outperforms existing post-hoc methods across five hallucination evaluation benchmarks for four LVLMs. Further analysis confirms that PSRD effectively mitigates hallucination propagation and achieves a highly controllable trade-off between strong performance and inference efficiency.
