MineTheGap: Automatic Mining of Biases in Text-to-Image Models

December 15, 2025
Authors: Noa Cohen, Nurit Spingarn-Eliezer, Inbar Huberman-Spiegelglas, Tomer Michaeli
cs.AI

Abstract

Text-to-Image (TTI) models generate images based on text prompts, which often leave certain aspects of the desired image ambiguous. When faced with these ambiguities, TTI models have been shown to exhibit biases in their interpretations. These biases can have societal impacts, e.g., when a model shows only a certain race for a stated occupation. They can also degrade user experience by creating redundancy within a set of generated images instead of spanning diverse possibilities. Here, we introduce MineTheGap, a method for automatically mining prompts that cause a TTI model to generate biased outputs. Our method goes beyond merely detecting bias for a given prompt. Rather, it leverages a genetic algorithm to iteratively refine a pool of prompts, seeking those that expose biases. This optimization process is driven by a novel bias score, which ranks biases according to their severity, as we validate on a dataset with known biases. For a given prompt, this score is obtained by comparing the distribution of generated images to the distribution of LLM-generated texts that constitute variations on the prompt. Code and examples are available on the project's webpage.
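
The loop described in the abstract (score a pool of prompts, keep the most bias-revealing ones, and derive new candidates from them) can be illustrated with a minimal sketch. This is not the authors' implementation. The names `mine_prompts`, `score`, and `mutate` are hypothetical, and `total_variation` is only a stand-in for the paper's bias score: it assumes the generated images and the LLM-generated text variations have already been reduced to categorical labels by some external step that is not shown here.

```python
import random
from collections import Counter
from typing import Callable, List, Sequence


def total_variation(a: Sequence[str], b: Sequence[str]) -> float:
    """Total variation distance between two non-empty label samples.

    Stand-in for a bias score (an assumption, not the paper's score):
    `a` could hold labels of generated images, `b` labels of text variations.
    """
    pa, pb = Counter(a), Counter(b)
    labels = set(pa) | set(pb)
    return 0.5 * sum(abs(pa[x] / len(a) - pb[x] / len(b)) for x in labels)


def mine_prompts(
    pool: List[str],
    score: Callable[[str], float],
    mutate: Callable[[str, random.Random], str],
    generations: int = 5,
    keep: int = 2,
    seed: int = 0,
) -> List[str]:
    """Generic elitist genetic loop: keep top prompts, refill by mutation."""
    rng = random.Random(seed)
    size = len(pool)
    for _ in range(generations):
        # Deduplicate while preserving order, then keep the highest scorers.
        unique = list(dict.fromkeys(pool))
        survivors = sorted(unique, key=score, reverse=True)[:keep]
        # Refill the pool with mutated copies of randomly chosen survivors.
        children = [
            mutate(rng.choice(survivors), rng)
            for _ in range(size - len(survivors))
        ]
        pool = survivors + children
    return sorted(dict.fromkeys(pool), key=score, reverse=True)
```

In a real pipeline, `score` would wrap image generation and the distribution comparison, and `mutate` would likely ask an LLM to rewrite a prompt; both are left to the caller here so that the sketch stays self-contained. Because the top survivors are carried over unchanged, the best score in the pool never decreases from one generation to the next.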