
CAPTAIN: Semantic Feature Injection for Memorization Mitigation in Text-to-Image Diffusion Models

December 11, 2025
Authors: Tong Zhang, Carlos Hinojosa, Bernard Ghanem
cs.AI

Abstract

Diffusion models can unintentionally reproduce training examples, raising privacy and copyright concerns as these systems are increasingly deployed at scale. Existing inference-time mitigation methods typically manipulate classifier-free guidance (CFG) or perturb prompt embeddings; however, they often struggle to reduce memorization without compromising alignment with the conditioning prompt. We introduce CAPTAIN, a training-free framework that mitigates memorization by directly modifying latent features during denoising. CAPTAIN first applies frequency-based noise initialization to reduce the tendency to replicate memorized patterns early in the denoising process. It then identifies the optimal denoising timesteps for feature injection and localizes memorized regions. Finally, CAPTAIN injects semantically aligned features from non-memorized reference images into localized latent regions, suppressing memorization while preserving prompt fidelity and visual quality. Our experiments show that CAPTAIN achieves substantial reductions in memorization compared to CFG-based baselines while maintaining strong alignment with the intended prompt.
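To make the pipeline concrete, below is a minimal sketch in PyTorch of the two latent-space operations the abstract names: frequency-based noise initialization and injection of reference features into localized latent regions. The function names, FFT cutoff, attenuation factor, blending weight, injection window, and the placeholder mask and reference latent are all illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch only; the actual CAPTAIN implementation may differ.
import torch
import torch.fft


def frequency_based_noise_init(shape, cutoff_ratio=0.25, generator=None):
    """Sample Gaussian latent noise and attenuate its low-frequency band.

    The intent (per the abstract) is to weaken the global structure that early
    denoising steps could lock onto when replicating memorized patterns.
    The cutoff ratio and dampening factor here are hypothetical.
    """
    noise = torch.randn(shape, generator=generator)
    spec = torch.fft.fftshift(torch.fft.fft2(noise), dim=(-2, -1))

    h, w = shape[-2], shape[-1]
    yy, xx = torch.meshgrid(
        torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij"
    )
    radius = torch.sqrt(xx**2 + yy**2)
    low_freq = (radius < cutoff_ratio).float()

    # Attenuate (rather than zero out) the low-frequency components.
    spec = spec * (1.0 - 0.5 * low_freq)
    filtered = torch.fft.ifft2(torch.fft.ifftshift(spec, dim=(-2, -1))).real

    # Re-normalize so the result still resembles unit-variance Gaussian noise.
    return (filtered - filtered.mean()) / (filtered.std() + 1e-8)


def inject_reference_features(latent, reference_latent, mask, weight=0.5):
    """Blend features from a non-memorized reference into the masked
    (memorized) latent regions, leaving the rest of the latent untouched."""
    return latent * (1 - mask * weight) + reference_latent * (mask * weight)


if __name__ == "__main__":
    shape = (1, 4, 64, 64)  # typical Stable Diffusion latent shape (assumed)
    latent = frequency_based_noise_init(shape)

    # Hypothetical memorization mask and reference latent; in the paper these
    # come from localizing memorized regions and from a non-memorized
    # reference image, respectively.
    mask = torch.zeros(shape)
    mask[..., 16:48, 16:48] = 1.0
    reference_latent = torch.randn(shape)

    injection_timesteps = range(10, 20)  # assumed injection window
    for t in range(50):
        if t in injection_timesteps:
            latent = inject_reference_features(latent, reference_latent, mask)
        # ... the diffusion model's denoising step for timestep t would go here
```

Attenuating rather than removing the low-frequency band, and re-normalizing afterward, keeps the initial latent close to the Gaussian prior the denoiser expects; likewise, masked blending confines the intervention to the localized regions so prompt fidelity elsewhere is preserved.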