Flow Map Distillation Without Data

November 24, 2025
Authors: Shangyuan Tong, Nanye Ma, Saining Xie, Tommi Jaakkola
cs.AI

Abstract

State-of-the-art flow models achieve remarkable quality but require slow, iterative sampling. To accelerate this, flow maps can be distilled from pre-trained teachers, a procedure that conventionally requires sampling from an external dataset. We argue that this data-dependency introduces a fundamental risk of Teacher-Data Mismatch, as a static dataset may provide an incomplete or even misaligned representation of the teacher's full generative capabilities. This leads us to question whether this reliance on data is truly necessary for successful flow map distillation. In this work, we explore a data-free alternative that samples only from the prior distribution, a distribution the teacher is guaranteed to follow by construction, thereby circumventing the mismatch risk entirely. To demonstrate the practical viability of this philosophy, we introduce a principled framework that learns to predict the teacher's sampling path while actively correcting for its own compounding errors to ensure high fidelity. Our approach surpasses all data-based counterparts and establishes a new state-of-the-art by a significant margin. Specifically, distilling from SiT-XL/2+REPA, our method reaches an impressive FID of 1.45 on ImageNet 256x256, and 1.49 on ImageNet 512x512, both with only 1 sampling step. We hope our work establishes a more robust paradigm for accelerating generative models and motivates the broader adoption of flow map distillation without data.
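To make the data-free recipe concrete, the sketch below illustrates the general idea in a toy PyTorch setting: sample only from the Gaussian prior, integrate a frozen teacher's velocity field to obtain target endpoints, and regress a one-step student onto them. This is a hypothetical illustration under stated assumptions, not the paper's actual algorithm; in particular, the proposed error-correction mechanism is omitted, and every name here (TinyVelocityNet, teacher_endpoint, distill_step) is invented for the example.

```python
# Hypothetical, minimal sketch of data-free flow map distillation.
# NOT the paper's algorithm: the error-correction mechanism is omitted,
# and the teacher here is a toy network standing in for a pre-trained model.
import torch
import torch.nn as nn

class TinyVelocityNet(nn.Module):
    """Toy velocity field v(x, t) standing in for a flow model."""
    def __init__(self, dim=8, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, x, t):
        # t is a [1, 1] scalar time in [0, 1], broadcast to each sample.
        t_feat = t.expand(x.shape[0], 1)
        return self.net(torch.cat([x, t_feat], dim=-1))

@torch.no_grad()
def teacher_endpoint(teacher, x0, n_steps=32):
    """Euler-integrate the teacher ODE dx/dt = v(x, t) from t=0 to t=1."""
    x, dt = x0, 1.0 / n_steps
    for i in range(n_steps):
        t = torch.tensor([[i * dt]], device=x.device)
        x = x + dt * teacher(x, t)
    return x

def distill_step(student, teacher, optimizer, batch_size=128, dim=8):
    """One data-free distillation step: prior noise in, teacher endpoint out."""
    x0 = torch.randn(batch_size, dim)        # sample ONLY from the prior
    target = teacher_endpoint(teacher, x0)   # endpoint of the teacher's sampling path
    t0 = torch.zeros(1, 1)
    pred = x0 + student(x0, t0)              # one-step flow map: noise -> sample
    loss = ((pred - target) ** 2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

if __name__ == "__main__":
    teacher = TinyVelocityNet().eval()       # stands in for a pre-trained teacher
    student = TinyVelocityNet()
    opt = torch.optim.Adam(student.parameters(), lr=1e-3)
    for step in range(100):
        loss = distill_step(student, teacher, opt)
    print(f"final distillation loss: {loss:.4f}")
```

A real implementation would distill from the pre-trained SiT-XL/2+REPA teacher reported above and add the paper's correction for the student's compounding errors; the sketch only shows why no external dataset is needed, since every training input comes from the prior the teacher is constructed to accept.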