
IF-Bench: Benchmarking and Enhancing MLLMs for Infrared Images with Generative Visual Prompting

December 10, 2025
Authors: Tao Zhang, Yuyang Hong, Yang Xia, Kun Ding, Zeyu Zhang, Ying Wang, Shiming Xiang, Chunhong Pan
cs.AI

Abstract

Recent advances in multimodal large language models (MLLMs) have led to impressive progress across various benchmarks. However, their capability in understanding infrared images remains unexplored. To address this gap, we introduce IF-Bench, the first high-quality benchmark designed for evaluating multimodal understanding of infrared images. IF-Bench consists of 499 images sourced from 23 infrared datasets and 680 carefully curated visual question-answer pairs, covering 10 essential dimensions of image understanding. Based on this benchmark, we systematically evaluate over 40 open-source and closed-source MLLMs, employing cyclic evaluation, bilingual assessment, and hybrid judgment strategies to enhance the reliability of the results. Our analysis reveals how model scale, architecture, and inference paradigms affect infrared image comprehension, providing valuable insights for this area. Furthermore, we propose a training-free generative visual prompting (GenViP) method, which leverages advanced image editing models to translate infrared images into semantically and spatially aligned RGB counterparts, thereby mitigating domain distribution shifts. Extensive experiments demonstrate that our method consistently yields significant performance improvements across a wide range of MLLMs. The benchmark and code are available at https://github.com/casiatao/IF-Bench.
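
The abstract names cyclic evaluation without defining it. One common reading, assumed here and not confirmed by the source, is that each multiple-choice question is asked once per cyclic shift of its options and counts as correct only if the model chooses the right option every time, which penalizes position bias. The sketch below illustrates that reading only; `rotations`, `cyclic_correct`, and the `ask` callback are hypothetical names and do not come from the IF-Bench code.

```python
from typing import Callable, List, Sequence

# Hypothetical model interface: takes a question and an ordered option list,
# returns the index of the option it picks.
AskFn = Callable[[str, Sequence[str]], int]


def rotations(options: Sequence[str]) -> List[List[str]]:
    """Return every cyclic shift of the option list, starting with the original order."""
    n = len(options)
    return [list(options[k:]) + list(options[:k]) for k in range(n)]


def cyclic_correct(question: str, options: Sequence[str], answer: str, ask: AskFn) -> bool:
    """Assumed scoring rule: correct only if the right option text is picked under every shift."""
    for shifted in rotations(options):
        picked = ask(question, shifted)
        if shifted[picked] != answer:
            return False
    return True


# Two toy "models": one always picks the first option, one always finds the answer.
always_first: AskFn = lambda q, opts: 0
oracle: AskFn = lambda q, opts: list(opts).index("person")

question = "What is the warmest object in the scene?"
options = ["car", "person", "dog"]

biased_result = cyclic_correct(question, options, "person", always_first)
oracle_result = cyclic_correct(question, options, "person", oracle)
```

Under this assumed rule, a model that always answers with the first option fails the question even though it is right in one of the three orderings, while a model that tracks the option text passes.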