DeLeaker: Dynamic Inference-Time Reweighting For Semantic Leakage Mitigation in Text-to-Image Models
October 16, 2025
Authors: Mor Ventura, Michael Toker, Or Patashnik, Yonatan Belinkov, Roi Reichart
cs.AI
Abstract
Text-to-Image (T2I) models have advanced rapidly, yet they remain vulnerable
to semantic leakage, the unintended transfer of semantically related features
between distinct entities. Existing mitigation strategies are often
optimization-based or dependent on external inputs. We introduce DeLeaker, a
lightweight, optimization-free inference-time approach that mitigates leakage
by directly intervening on the model's attention maps. Throughout the diffusion
process, DeLeaker dynamically reweights attention maps to suppress excessive
cross-entity interactions while strengthening the identity of each entity. To
support systematic evaluation, we introduce SLIM (Semantic Leakage in IMages),
the first dataset dedicated to semantic leakage, comprising 1,130
human-verified samples spanning diverse scenarios, together with a novel
automatic evaluation framework. Experiments demonstrate that DeLeaker
consistently outperforms all baselines, even when they are provided with
external information, achieving effective leakage mitigation without
compromising fidelity or quality. These results underscore the value of
attention control and pave the way for more semantically precise T2I models.
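
The abstract describes the intervention only at a high level, so the following is a minimal illustrative sketch of what reweighting an attention map between entities could look like, not the authors' algorithm. The function name `reweight_attention`, the fixed `suppress` and `boost` factors, and the per-token `entity_of` labels are all assumptions made for illustration; the abstract describes the reweighting as dynamic across the diffusion process and does not specify how the weights are chosen.

```python
def reweight_attention(attn, entity_of, suppress=0.5, boost=1.5):
    """Rescale one attention map by entity membership (hypothetical sketch).

    attn:      list of rows; attn[q][k] is the weight query token q puts on key token k.
    entity_of: entity id per token, or None for tokens that belong to no entity.
    suppress:  assumed fixed factor (< 1) applied to cross-entity weights.
    boost:     assumed fixed factor (> 1) applied to same-entity weights.
    """
    out = []
    for q, row in enumerate(attn):
        new_row = []
        for k, w in enumerate(row):
            eq, ek = entity_of[q], entity_of[k]
            if eq is not None and ek is not None:
                # Same entity: strengthen. Different entities: damp.
                w *= boost if eq == ek else suppress
            new_row.append(w)
        # Renormalize so each row is again a probability distribution.
        total = sum(new_row)
        out.append([w / total for w in new_row] if total > 0 else new_row)
    return out


# Toy example: tokens 0 and 1 belong to entity 0, token 2 to entity 1,
# and token 3 (e.g. a background word) to no entity.
attn = [[0.25, 0.25, 0.25, 0.25] for _ in range(4)]
entity_of = [0, 0, 1, None]
out = reweight_attention(attn, entity_of)
```

In this toy case, the weight that token 0 places on token 2 (a different entity) shrinks, its weight on tokens 0 and 1 (the same entity) grows, and the row for token 3 is left unchanged. A real implementation would apply such a rule inside the model's attention layers at each denoising step, with weights that vary over the process.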