Noise Consistency Training: A Native Approach for One-Step Generator in Learning Additional Controls
June 24, 2025
Authors: Yihong Luo, Shuchen Xue, Tianyang Hu, Jing Tang
cs.AI
Abstract
The pursuit of efficient and controllable high-quality content generation
remains a central challenge in artificial intelligence-generated content
(AIGC). While one-step generators, enabled by diffusion distillation
techniques, offer excellent generation quality and computational efficiency,
adapting them to new control conditions (such as structural constraints,
semantic guidelines, or external inputs) remains difficult.
Conventional approaches often necessitate computationally expensive
modifications to the base model and subsequent diffusion distillation. This
paper introduces Noise Consistency Training (NCT), a novel and lightweight
approach to directly integrate new control signals into pre-trained one-step
generators without requiring access to original training images or retraining
the base diffusion model. NCT operates by introducing an adapter module and
employing a noise consistency loss in the noise space of the generator. This loss
aligns the adapted model's generation behavior across noises that are
conditionally dependent to varying degrees, implicitly guiding it to adhere to
the new control. Theoretically, this training objective can be understood as
minimizing the distributional distance between the adapted generator and the
conditional distribution induced by the new conditions. NCT is modular,
data-efficient, and easily deployable, relying only on the pre-trained one-step
generator and a control signal model. Extensive experiments demonstrate that
NCT achieves state-of-the-art controllable generation in a single forward pass,
surpassing existing multi-step and distillation-based methods in both
generation quality and computational efficiency. Code is available at
https://github.com/Luo-Yihong/NCT
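
The abstract describes the noise consistency loss only in words, so the following is a toy sketch of the general idea rather than the paper's actual objective. It assumes a variance-preserving mix between an anchor noise (assumed here to be paired with the control) and fresh noise, with a level `t` setting how much of the anchor survives, and it penalizes the squared gap between generator outputs at two neighbouring levels. The names `mix_noise`, `toy_generator`, `adapter_weight`, and `noise_consistency_loss` are hypothetical and do not come from the NCT repository.

```python
import math
import random


def mix_noise(z_anchor, eps, t):
    """Variance-preserving mix: t=0 returns z_anchor, t=1 returns eps.

    The squared coefficients sum to 1, so mixing two independent
    standard Gaussian samples keeps unit variance.
    """
    a = math.sqrt(1.0 - t * t)
    return [a * za + t * e for za, e in zip(z_anchor, eps)]


def toy_generator(z, control, adapter_weight):
    """Hypothetical stand-in for a frozen one-step generator plus an adapter.

    The "adapter" is a single scalar that injects the control signal
    into the noise input; a real adapter would be a neural network.
    """
    return [math.tanh(zi + adapter_weight * ci) for zi, ci in zip(z, control)]


def noise_consistency_loss(z_anchor, eps, control, adapter_weight, t, dt):
    """Mean squared gap between outputs at levels t and t - dt (assumed form)."""
    out_t = toy_generator(mix_noise(z_anchor, eps, t), control, adapter_weight)
    out_s = toy_generator(mix_noise(z_anchor, eps, t - dt), control, adapter_weight)
    return sum((a - b) ** 2 for a, b in zip(out_t, out_s)) / len(out_t)


rng = random.Random(0)
dim = 8
z_anchor = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # noise assumed tied to the control
eps = [rng.gauss(0.0, 1.0) for _ in range(dim)]       # independent fresh noise
control = [rng.gauss(0.0, 1.0) for _ in range(dim)]   # placeholder control features

loss = noise_consistency_loss(z_anchor, eps, control, adapter_weight=0.5, t=0.6, dt=0.1)
print(f"toy consistency loss: {loss:.6f}")
```

In an actual training setup the generator would be a neural network, the adapter parameters would be updated by gradient descent on a loss of this kind, and one of the two branches would typically be held fixed as the target; those details are left out here because the abstract does not specify them.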