AdamMeme: Adaptively Probe the Reasoning Capacity of Multimodal Large Language Models on Harmfulness

Abstract

The proliferation of multimodal memes in the social media era demands that multimodal Large Language Models (mLLMs) effectively understand meme harmfulness. Existing benchmarks for assessing mLLMs on harmful meme understanding rely on accuracy-based, model-agnostic evaluations using static datasets. Because online memes evolve dynamically, these benchmarks are limited in their ability to provide up-to-date and thorough assessments. To address this, we propose AdamMeme, a flexible, agent-based evaluation framework that adaptively probes the reasoning capabilities of mLLMs in deciphering meme harmfulness. Through multi-agent collaboration, AdamMeme provides comprehensive evaluations by iteratively updating the meme data with challenging samples, thereby exposing specific limitations in how mLLMs interpret harmfulness. Extensive experiments show that our framework systematically reveals the varying performance of different target mLLMs, offering in-depth, fine-grained analyses of model-specific weaknesses. Our code is available at https://github.com/Lbotirx/AdamMeme.
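As a rough illustration of the adaptive loop described in the abstract, the following Python sketch evaluates a target model over several rounds and rewrites the samples it handles into harder variants. It is not the AdamMeme implementation: `Sample`, `adaptive_probe`, the toy target, judge, and refiner, and the category names are all hypothetical stand-ins, and the rule "refine what the model passes, keep what it fails" is an assumption about how challenging samples might be produced.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Sample:
    text: str            # stand-in for a meme (image plus caption) in this toy sketch
    category: str        # hypothetical harmfulness category
    difficulty: int = 0  # number of times the sample has been made harder


def adaptive_probe(
    pool: List[Sample],
    target: Callable[[Sample], str],
    judge: Callable[[Sample, str], bool],
    refine: Callable[[Sample], Sample],
    rounds: int = 3,
) -> Dict[str, float]:
    """Return the per-category failure rate of `target` across all rounds."""
    stats: Dict[str, Tuple[int, int]] = {}  # category -> (seen, failed)
    for _ in range(rounds):
        next_pool: List[Sample] = []
        for sample in pool:
            passed = judge(sample, target(sample))
            seen, failed = stats.get(sample.category, (0, 0))
            stats[sample.category] = (seen + 1, failed + (0 if passed else 1))
            # Assumption: passed samples are rewritten into harder variants,
            # failed samples are kept unchanged for the next round.
            next_pool.append(refine(sample) if passed else sample)
        pool = next_pool
    return {cat: failed / seen for cat, (seen, failed) in stats.items()}


# Toy stand-ins (hypothetical): the "model" copes with fewer refinements
# on the "implicit" category than on the "explicit" one.
LIMITS = {"explicit": 2, "implicit": 1}


def toy_target(sample: Sample) -> str:
    return "harmful" if sample.difficulty < LIMITS[sample.category] else "harmless"


def toy_judge(sample: Sample, answer: str) -> bool:
    # Every sample in this toy pool is harmful, so only "harmful" is correct.
    return answer == "harmful"


def toy_refine(sample: Sample) -> Sample:
    return Sample(sample.text + " (harder)", sample.category, sample.difficulty + 1)


pool = [Sample("meme A", "explicit"), Sample("meme B", "implicit")]
report = adaptive_probe(pool, toy_target, toy_judge, toy_refine, rounds=3)
print(report)
```

With these toy stubs, the "implicit" category fails in two of three rounds and the "explicit" category in one of three, which mimics the kind of per-category, model-specific weakness report the abstract refers to. A real setup would replace the stubs with calls to an mLLM and to agent models.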
