Optimizing Few-Step Generation with Adaptive Matching Distillation

Abstract

Distribution Matching Distillation (DMD) is a powerful acceleration paradigm, yet its stability is often compromised in Forbidden Zones, regions where the real teacher provides unreliable guidance while the fake teacher exerts insufficient repulsive force. In this work, we propose a unified optimization framework that reinterprets prior methods as implicit strategies for avoiding these corrupted regions. Based on this insight, we introduce Adaptive Matching Distillation (AMD), a self-correcting mechanism that uses reward proxies to explicitly detect and escape Forbidden Zones. AMD dynamically prioritizes corrective gradients via structural signal decomposition and introduces Repulsive Landscape Sharpening to enforce steep energy barriers against failure mode collapse. Extensive experiments across image and video generation tasks (e.g., SDXL, Wan2.1) and rigorous benchmarks (e.g., VBench, GenEval) demonstrate that AMD significantly enhances sample fidelity and training robustness. For instance, AMD improves the HPSv2 score on SDXL from 30.64 to 31.25, outperforming state-of-the-art baselines. These findings support the claim that explicitly rectifying optimization trajectories within Forbidden Zones is essential for pushing the performance ceiling of few-step generative models.
