Optimizing Few-Step Generation with Adaptive Matching Distillation
February 7, 2026
Authors: Lichen Bai, Zikai Zhou, Shitong Shao, Wenliang Zhong, Shuo Yang, Shuo Chen, Bojun Chen, Zeke Xie
cs.AI
Abstract
Distribution Matching Distillation (DMD) is a powerful acceleration paradigm, yet its stability is often compromised in Forbidden Zones: regions where the real teacher provides unreliable guidance while the fake teacher exerts insufficient repulsive force. In this work, we propose a unified optimization framework that reinterprets prior art as implicit strategies for avoiding these corrupted regions. Building on this insight, we introduce Adaptive Matching Distillation (AMD), a self-correcting mechanism that uses reward proxies to explicitly detect and escape Forbidden Zones. AMD dynamically prioritizes corrective gradients via structural signal decomposition and introduces Repulsive Landscape Sharpening to enforce steep energy barriers against failure-mode collapse. Extensive experiments across image and video generation tasks (e.g., SDXL, Wan2.1) and rigorous benchmarks (e.g., VBench, GenEval) demonstrate that AMD significantly enhances sample fidelity and training robustness. For instance, AMD improves the HPSv2 score on SDXL from 30.64 to 31.25, outperforming state-of-the-art baselines. These findings validate that explicitly rectifying optimization trajectories within Forbidden Zones is essential for pushing the performance ceiling of few-step generative models.
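The core idea can be illustrated with a toy sketch. Below is a minimal, hypothetical NumPy rendering of a DMD-style gradient (difference of the real and fake teachers' scores) with a reward-proxy gate: where the proxy flags a Forbidden Zone, the unreliable real-teacher term is dropped and the repulsive fake-teacher term is amplified. All function names, the gating rule, and the boost constant are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def dmd_gradient(score_real, score_fake, x):
    # DMD drives the student by the gap between the two teachers' scores.
    return score_real(x) - score_fake(x)

def amd_gradient(score_real, score_fake, reward_proxy, x, threshold=0.0):
    """Hypothetical AMD-style update (illustrative only): where the reward
    proxy flags a Forbidden Zone (reward below threshold), suppress the
    unreliable real-teacher term and sharpen the repulsive fake-teacher term."""
    g = dmd_gradient(score_real, score_fake, x)
    in_forbidden = reward_proxy(x) < threshold       # detect corrupted region
    repulsion_boost = 2.0                            # illustrative sharpening factor
    # Inside the zone, keep only the (boosted) repulsive component -score_fake.
    return np.where(in_forbidden, -repulsion_boost * score_fake(x), g)

# Toy 1-D example with linear scores; the second point lies in the "zone".
score_real = lambda x: -x             # real teacher pulls samples toward 0
score_fake = lambda x: -0.5 * x       # fake teacher's weaker score estimate
reward = lambda x: 1.0 - np.abs(x)    # proxy: reward drops far from 0

x = np.array([0.2, 3.0])
print(amd_gradient(score_real, score_fake, reward, x))
```

For the in-zone point (x = 3.0), the update flips from the small DMD gap to a strong repulsive push away from the fake teacher's mode, which is the qualitative behavior the abstract attributes to the sharpened repulsive landscape.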