Understanding Alignment in Multimodal LLMs: A Comprehensive Study
July 2, 2024
Authors: Elmira Amirloo, Jean-Philippe Fauconnier, Christoph Roesmann, Christian Kerl, Rinu Boney, Yusu Qian, Zirui Wang, Afshin Dehghan, Yinfei Yang, Zhe Gan, Peter Grasch
cs.AI
Abstract
Preference alignment has become a crucial component in enhancing the
performance of Large Language Models (LLMs), yet its impact in Multimodal Large
Language Models (MLLMs) remains comparatively underexplored. Similar to
language models, MLLMs for image understanding tasks encounter challenges like
hallucination. In MLLMs, hallucination can occur not only by stating incorrect
facts but also by producing responses that are inconsistent with the image
content. A primary objective of alignment for MLLMs is to encourage these
models to align responses more closely with image information. Recently,
multiple works have introduced preference datasets for MLLMs and examined
different alignment methods, including Direct Preference Optimization (DPO) and
Proximal Policy Optimization (PPO). However, due to variations in datasets,
base model types, and alignment methods, it remains unclear which specific
elements contribute most significantly to the reported improvements in these
works. In this paper, we independently analyze each aspect of preference
alignment in MLLMs. We start by categorizing the alignment algorithms into two
groups, offline (such as DPO), and online (such as online-DPO), and show that
combining offline and online methods can improve the performance of the model
in certain scenarios. We review a variety of published multimodal preference
datasets and discuss how the details of their construction impact model
performance. Based on these insights, we introduce a novel way of creating
multimodal preference data called Bias-Driven Hallucination Sampling (BDHS)
that needs neither additional annotation nor external models, and show that it
can achieve performance competitive with previously published alignment work for
multimodal models across a range of benchmarks.
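For readers unfamiliar with the offline objective referenced above, a minimal sketch of the standard DPO loss follows, written in the conventional notation of the original DPO formulation (Rafailov et al., 2023); the symbols are the usual ones and are not taken from this paper:

% \pi_\theta: the policy being aligned; \pi_{\mathrm{ref}}: a frozen reference model;
% (x, y_w, y_l): a prompt (including the image, in the MLLM setting) with preferred and dispreferred responses;
% \beta: a temperature controlling how far the policy may drift from the reference.
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}})
  = -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}
    \left[ \log \sigma\!\left(
      \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
      - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
    \right) \right]

Online variants such as online-DPO optimize the same form of loss but draw the compared responses from the current policy during training rather than from a fixed preference dataset.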