Evaluating and Steering Modality Preferences in Multimodal Large Language Model

Abstract

Multimodal large language models (MLLMs) have achieved remarkable performance on complex tasks with multimodal context. However, whether they exhibit modality preference when processing multimodal contexts remains understudied. To study this question, we first build the MC² benchmark under controlled evidence-conflict scenarios to systematically evaluate modality preference, that is, the tendency to favor one modality over another when making decisions based on conflicting multimodal evidence. Our extensive evaluation reveals that all 18 tested MLLMs generally demonstrate clear modality bias, and that modality preference can be influenced by external interventions. An in-depth analysis shows that the preference direction can be captured within the latent representations of MLLMs. Building on this, we propose a probing and steering method based on representation engineering to explicitly control modality preference without additional fine-tuning or carefully crafted prompts. Our method effectively amplifies modality preference toward a desired direction and applies to downstream tasks such as hallucination mitigation and multimodal machine translation, yielding promising improvements.
