MANI-Pure: Magnitude-Adaptive Noise Injection for Adversarial Purification

Abstract

Adversarial purification with diffusion models has emerged as a promising defense strategy, but existing methods typically rely on uniform noise injection, which perturbs all frequencies indiscriminately, corrupting semantic structure and undermining robustness. Our empirical study reveals that adversarial perturbations are not uniformly distributed: they are concentrated predominantly in high-frequency regions, with heterogeneous magnitude intensity patterns that vary across frequencies and attack types. Motivated by this observation, we introduce MANI-Pure, a magnitude-adaptive purification framework that uses the magnitude spectrum of the input to guide the purification process. Instead of injecting homogeneous noise, MANI-Pure adaptively applies heterogeneous, frequency-targeted noise, suppressing adversarial perturbations in fragile high-frequency, low-magnitude bands while preserving semantically critical low-frequency content. Extensive experiments on CIFAR-10 and ImageNet-1K validate the effectiveness of MANI-Pure. It narrows the clean accuracy gap to within 0.59% of the original classifier while boosting robust accuracy by 2.15%, and it achieves the top-1 robust accuracy on the RobustBench leaderboard, surpassing the previous state-of-the-art method.
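To make the idea of heterogeneous, frequency-targeted noise concrete, the following is a minimal NumPy sketch, not the authors' implementation. The abstract does not give the weighting rule, so the inverse log-magnitude weighting, the function name `magnitude_adaptive_noise`, and the parameters `sigma` and `rng` are all assumptions made for illustration. The sketch also omits the diffusion model and handles only a single 2D grayscale array.

```python
import numpy as np


def magnitude_adaptive_noise(image, sigma=0.1, rng=None):
    """Add Gaussian noise shaped by the input's own magnitude spectrum.

    Illustrative only: the weighting rule below is an assumption, not the
    rule used by MANI-Pure. `image` is a 2D float array.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Magnitude spectrum of the input image.
    magnitude = np.abs(np.fft.fft2(image))

    # Assumed weighting: low-magnitude bands receive larger weights,
    # high-magnitude (typically low-frequency) bands receive smaller ones.
    weights = 1.0 / (1.0 + np.log1p(magnitude))
    weights /= weights.mean()  # normalize so the average weight is 1

    # Shape white Gaussian noise in the frequency domain. The weights are
    # symmetric for a real image, so the result is real up to rounding.
    noise = rng.standard_normal(image.shape)
    shaped_noise = np.fft.ifft2(np.fft.fft2(noise) * weights).real

    return image + sigma * shaped_noise, weights
```

Under these assumptions, a smooth image with strong low-frequency content gets its smallest weight at the DC component, so most of the injected noise energy falls in the weaker bands. Uniform noise injection corresponds to setting every weight to 1.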
