Optimizing Few-Step Generation with Adaptive Matching Distillation

Abstract

Distribution Matching Distillation (DMD) is a powerful acceleration paradigm, yet its stability is often compromised in Forbidden Zones: regions where the real teacher provides unreliable guidance while the fake teacher exerts insufficient repulsive force. In this work, we propose a unified optimization framework that reinterprets prior techniques as implicit strategies for avoiding these corrupted regions. Building on this insight, we introduce Adaptive Matching Distillation (AMD), a self-correcting mechanism that uses reward proxies to explicitly detect and escape Forbidden Zones. AMD dynamically prioritizes corrective gradients via structural signal decomposition and introduces Repulsive Landscape Sharpening to enforce steep energy barriers against collapse into failure modes. Extensive experiments across image and video generation tasks (e.g., SDXL, Wan2.1) and rigorous benchmarks (e.g., VBench, GenEval) demonstrate that AMD significantly enhances sample fidelity and training robustness. For instance, AMD improves the HPSv2 score on SDXL from 30.64 to 31.25, outperforming state-of-the-art baselines. These findings validate that explicitly rectifying optimization trajectories within Forbidden Zones is essential for pushing the performance ceiling of few-step generative models.
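
As a purely illustrative aid, the toy sketch below shows the general shape of a reward-gated DMD update in one dimension. It is not the AMD formulation, which the abstract does not specify. It assumes Gaussian stand-ins for the real and fake teachers, uses the standard DMD direction (fake score minus real score), and introduces a hypothetical gate: samples whose reward proxy falls below a hypothetical `threshold` are flagged, and their fake-score (repulsive) term is scaled by a hypothetical `boost` factor. The names `gaussian_score` and `gated_dmd_direction` are likewise assumptions made for this example.

```python
import numpy as np


def gaussian_score(x, mean, std=1.0):
    """Score (gradient of the log-density) of a 1-D Gaussian N(mean, std^2)."""
    return -(x - mean) / std**2


def gated_dmd_direction(x, real_mean, fake_mean, reward, threshold=0.5, boost=2.0):
    """Toy reward-gated DMD direction (hypothetical, not the paper's method).

    The plain DMD descent direction is s_fake(x) - s_real(x). Here, samples
    with reward < threshold are flagged, and the fake-score term of flagged
    samples is multiplied by `boost` to strengthen the repulsive component.
    Returns the per-sample direction and the boolean flag mask.
    """
    s_real = gaussian_score(x, real_mean)  # attraction toward the real mode
    s_fake = gaussian_score(x, fake_mean)  # its negative repels from the fake mode
    flagged = reward < threshold           # assumed detection rule
    fake_weight = np.where(flagged, boost, 1.0)
    return fake_weight * s_fake - s_real, flagged


if __name__ == "__main__":
    x = np.array([4.0, 4.5])        # generator samples
    reward = np.array([0.2, 0.8])   # made-up reward-proxy values
    direction, flagged = gated_dmd_direction(
        x, real_mean=2.0, fake_mean=5.0, reward=reward
    )
    lr = 0.1
    x_new = x - lr * direction      # one gradient-descent step on the samples
    print(flagged, direction, x_new)
```

In this toy setting the flagged sample receives a larger step away from the fake mode than it would under the ungated direction, which is the qualitative behavior the gate is meant to illustrate. The real method operates on diffusion-model scores and learned reward models rather than closed-form Gaussians.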
