DeLeaker: Dynamic Inference-Time Reweighting For Semantic Leakage Mitigation in Text-to-Image Models
October 16, 2025
Authors: Mor Ventura, Michael Toker, Or Patashnik, Yonatan Belinkov, Roi Reichart
cs.AI
Abstract
Text-to-Image (T2I) models have advanced rapidly, yet they remain vulnerable
to semantic leakage, the unintended transfer of semantically related features
between distinct entities. Existing mitigation strategies are often
optimization-based or dependent on external inputs. We introduce DeLeaker, a
lightweight, optimization-free inference-time approach that mitigates leakage
by directly intervening on the model's attention maps. Throughout the diffusion
process, DeLeaker dynamically reweights attention maps to suppress excessive
cross-entity interactions while strengthening the identity of each entity. To
support systematic evaluation, we introduce SLIM (Semantic Leakage in IMages),
the first dataset dedicated to semantic leakage, comprising 1,130
human-verified samples spanning diverse scenarios, together with a novel
automatic evaluation framework. Experiments demonstrate that DeLeaker
consistently outperforms all baselines, even when they are provided with
external information, achieving effective leakage mitigation without
compromising fidelity or quality. These results underscore the value of
attention control and pave the way for more semantically precise T2I models.
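The core intervention described above, suppressing cross-entity attention while strengthening each entity's own attention, can be sketched in a minimal form. The function below is an illustrative reconstruction, not the paper's implementation: the name `reweight_attention`, the `cross_scale`/`self_scale` parameters, and the assumption that each query (image patch) and key (text token) carries an entity label are all hypothetical.

```python
import numpy as np

def reweight_attention(attn, query_ids, key_ids,
                       cross_scale=0.5, self_scale=1.2):
    """Illustrative sketch of cross-entity attention reweighting.

    attn      : (Q, K) row-stochastic attention map (queries x keys).
    query_ids : length-Q entity label per query; -1 means unlabeled.
    key_ids   : length-K entity label per key; -1 means unlabeled.
    Entries linking two different entities are suppressed, entries
    within the same entity are amplified, then rows are renormalized.
    """
    attn = attn.astype(float).copy()
    q = np.asarray(query_ids)[:, None]   # (Q, 1)
    k = np.asarray(key_ids)[None, :]     # (1, K)
    labeled = (q >= 0) & (k >= 0)
    cross = labeled & (q != k)           # cross-entity interactions
    same = labeled & (q == k)            # within-entity interactions
    attn[cross] *= cross_scale           # suppress leakage pathways
    attn[same] *= self_scale             # reinforce entity identity
    attn /= attn.sum(axis=1, keepdims=True)  # keep rows stochastic
    return attn
```

In a diffusion model this reweighting would be applied inside the attention layers at each denoising step; the scale factors here are arbitrary placeholders, whereas DeLeaker determines its reweighting dynamically during inference.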