Image Diffusion Preview with Consistency Solver
December 15, 2025
Authors: Fu-Yun Wang, Hao Zhou, Liangzhe Yuan, Sanghyun Woo, Boqing Gong, Bohyung Han, Ming-Hsuan Yang, Han Zhang, Yukun Zhu, Ting Liu, Long Zhao
cs.AI
Abstract
The slow inference of image diffusion models significantly degrades interactive user experiences. To address this, we introduce Diffusion Preview, a novel paradigm that uses rapid, low-step sampling to generate preliminary outputs for user evaluation and defers full-step refinement until the preview is deemed satisfactory. Existing acceleration methods, including training-free solvers and post-training distillation, struggle to deliver high-quality previews or to ensure consistency between previews and final outputs. We propose ConsistencySolver, a lightweight, trainable high-order solver derived from general linear multistep methods and optimized via reinforcement learning, which improves both preview quality and consistency. Experiments show that ConsistencySolver significantly improves generation quality and consistency in low-step scenarios, making it well suited to efficient preview-and-refine workflows. Notably, it achieves FID scores on par with Multistep DPM-Solver using 47% fewer steps, while outperforming distillation baselines. Furthermore, user studies indicate that our approach reduces overall user interaction time by nearly 50% while maintaining generation quality. Code is available at https://github.com/G-U-N/consolver.
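To make the phrase "general linear multistep methods" concrete, the following is a minimal sketch of an explicit linear multistep update on a toy ODE. It is not the authors' ConsistencySolver: the function name `multistep_solve`, the toy dynamics `decay`, and the fixed Adams-Bashforth weights are all illustrative assumptions. The only point it shows is that the next state is a weighted sum of recent derivative evaluations, and that those weights are the natural place for a trainable solver to put its parameters.

```python
import math

def multistep_solve(f, x0, t0, t1, n_steps, coeffs):
    """Integrate dx/dt = f(t, x) from t0 to t1 with an explicit linear multistep rule.

    coeffs[j] weights the derivative evaluated j steps back (coeffs[0] is the newest).
    Until enough history exists, the solver falls back to plain Euler steps.
    """
    h = (t1 - t0) / n_steps
    x, t = x0, t0
    history = []  # derivative evaluations, most recent first
    for _ in range(n_steps):
        history.insert(0, f(t, x))
        history = history[:len(coeffs)]
        if len(history) < len(coeffs):
            x = x + h * history[0]  # Euler warm-up step
        else:
            x = x + h * sum(c * d for c, d in zip(coeffs, history))
        t += h
    return x

# Fixed, classical weights (assumption: stand-ins for learned coefficients).
EULER = (1.0,)       # first-order rule, one derivative per update
AB2 = (1.5, -0.5)    # second-order Adams-Bashforth, reuses the previous derivative

def decay(t, x):
    """Toy dynamics dx/dt = -x, whose exact solution is exp(-t)."""
    return -x

exact = math.exp(-1.0)
err_euler = abs(multistep_solve(decay, 1.0, 0.0, 1.0, 8, EULER) - exact)
err_ab2 = abs(multistep_solve(decay, 1.0, 0.0, 1.0, 8, AB2) - exact)
print(f"Euler error: {err_euler:.5f}, AB2 error: {err_ab2:.5f}")
```

With the same number of function evaluations, the higher-order rule lands closer to the exact solution on this toy problem, which is the general motivation for using multistep solvers when the step budget is small.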