TAG: Tangential Amplifying Guidance for Hallucination-Resistant Diffusion Sampling

October 6, 2025
Authors: Hyunmin Cho, Donghoon Ahn, Susung Hong, Jee Eun Kim, Seungryong Kim, Kyong Hwan Jin
cs.AI

Abstract

Recent diffusion models achieve state-of-the-art performance in image generation but often suffer from semantic inconsistencies or hallucinations. While various inference-time guidance methods can enhance generation, they often operate indirectly by relying on external signals or architectural modifications, which introduces additional computational overhead. In this paper, we propose Tangential Amplifying Guidance (TAG), a more efficient and direct guidance method that operates solely on trajectory signals without modifying the underlying diffusion model. TAG uses an intermediate sample as a projection basis and amplifies the tangential components of the estimated scores with respect to this basis to correct the sampling trajectory. We formalize this guidance process with a first-order Taylor expansion, which demonstrates that amplifying the tangential component steers the state toward higher-probability regions, thereby reducing inconsistencies and enhancing sample quality. TAG is a plug-and-play, architecture-agnostic module that improves diffusion sampling fidelity with minimal additional computation, offering a new perspective on diffusion guidance.
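
The projection step described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the exact update rule, so everything here is an assumption made for illustration: the function name `amplify_tangential`, the gain parameter `eta`, and the choice to treat the flattened intermediate sample `x` as a single direction against which the estimated score is split into a parallel part and an orthogonal (tangential) part. It is not the authors' implementation.

```python
import numpy as np


def amplify_tangential(x, score, eta=1.5, eps=1e-12):
    """Scale the component of `score` orthogonal to `x` by `eta`.

    Hypothetical helper: `x` stands in for the intermediate sample and
    `score` for the estimated score at the current sampling step.
    """
    x_flat = np.asarray(x, dtype=float).reshape(-1)
    s_flat = np.asarray(score, dtype=float).reshape(-1)

    # Unit vector along the intermediate sample (the assumed projection basis).
    unit = x_flat / (np.linalg.norm(x_flat) + eps)

    # Component of the score parallel to the sample direction.
    parallel = np.dot(s_flat, unit) * unit

    # Remainder is orthogonal to the sample direction (the tangential part).
    tangential = s_flat - parallel

    # Keep the parallel part unchanged and amplify only the tangential part.
    guided = parallel + eta * tangential
    return guided.reshape(np.shape(score))
```

With `eta = 1.0` the function returns the score unchanged, so in this sketch the gain only alters the direction orthogonal to the sample. Because it needs only the current sample and score, it involves no extra network evaluation, which is in the spirit of the plug-and-play, low-overhead claim in the abstract.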