Few-step Flow for 3D Generation via Marginal-Data Transport Distillation

September 4, 2025
Authors: Zanwei Zhou, Taoran Yi, Jiemin Fang, Chen Yang, Lingxi Xie, Xinggang Wang, Wei Shen, Qi Tian
cs.AI

Abstract

Flow-based 3D generation models typically require dozens of sampling steps during inference. Although few-step distillation methods, particularly Consistency Models (CMs), have substantially accelerated 2D diffusion models, they remain under-explored for more complex 3D generation tasks. In this study, we propose a novel framework, MDT-dist, for few-step 3D flow distillation. Our approach is built on one primary objective: distilling the pretrained model so that it learns the Marginal-Data Transport. Learning this objective directly requires integrating the velocity field, and that integral is intractable. We therefore propose two optimizable objectives, Velocity Matching (VM) and Velocity Distillation (VD), which equivalently convert the optimization target from the transport level to the velocity level and the distribution level, respectively. VM learns to stably match the velocity fields of the student and the teacher, but inevitably provides biased gradient estimates. VD further improves the optimization by using the learned velocity fields to perform probability density distillation. When evaluated on the pioneering 3D generation framework TRELLIS, our method reduces the sampling steps of each flow transformer from 25 to 1 or 2, achieving latencies of 0.68s (1 step x 2) and 0.94s (2 steps x 2) on an A800, which correspond to 9.0x and 6.5x speedups, while preserving high visual and geometric fidelity. Extensive experiments demonstrate that our method significantly outperforms existing CM distillation methods and enables TRELLIS to achieve superior performance in few-step 3D generation.
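To make the step-count trade-off concrete, the following is a minimal toy sketch and not the paper's method. It assumes a hypothetical scalar velocity field (`toy_velocity`) whose transport map is known in closed form, and integrates it with a plain Euler sampler (`euler_sample`, also a name introduced here for illustration). Naively cutting the number of Euler steps from 25 to 1 or 2 degrades the result, which is the gap a few-step distillation approach aims to close by having the student learn the transport itself.

```python
import math


def euler_sample(velocity, x0, num_steps):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed-size Euler steps."""
    x = x0
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        x = x + dt * velocity(x, t)
    return x


def toy_velocity(x, t):
    # Hypothetical stand-in for a teacher's velocity field (assumption, not from the paper).
    # The exact solution of dx/dt = -2x is x(1) = x0 * exp(-2).
    return -2.0 * x


def exact_transport(x0):
    # Closed-form transport map for the toy field; an ideal one-step student
    # would reproduce this map in a single evaluation.
    return x0 * math.exp(-2.0)


x0 = 1.0
target = exact_transport(x0)

# Absolute error of naive Euler sampling at the step counts mentioned in the abstract.
errors = {n: abs(euler_sample(toy_velocity, x0, n) - target) for n in (1, 2, 25)}

for n, err in errors.items():
    print(f"{n:>2} Euler step(s): error = {err:.4f}")
```

In this toy setting the error shrinks as the step count grows, so the 25-step sampler is far closer to the exact transport than the 1-step or 2-step one. Real flow transformers have no closed-form map, which is why the integral has to be approached through surrogate objectives.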