Diffusion Templates: A Unified Plugin Framework for Controllable Diffusion
April 27, 2026
Authors: Zhongjie Duan, Hong Zhang, Yingda Chen
cs.AI
Abstract
Controllable diffusion methods have substantially expanded the practical utility of diffusion models, but they are typically developed as isolated, backbone-specific systems with incompatible training pipelines, parameter formats, and runtime hooks. This fragmentation makes it difficult to reuse infrastructure across tasks, transfer capabilities across backbones, or compose multiple controls within a single generation pipeline. We present Diffusion Templates, a unified and open plugin framework that decouples base-model inference from controllable capability injection. The framework is organized around three components: Template models that map arbitrary task-specific inputs to an intermediate capability representation, a Template cache that functions as a standardized interface for capability injection, and a Template pipeline that loads, merges, and injects one or more Template caches into the base diffusion runtime. Because the interface is defined at the systems level rather than tied to a specific control architecture, heterogeneous capability carriers such as KV-Cache and LoRA can be supported under the same abstraction. Based on this design, we build a diverse model zoo spanning structural control, brightness adjustment, color adjustment, image editing, super-resolution, sharpness enhancement, aesthetic alignment, content reference, local inpainting, and age control. These case studies show that Diffusion Templates can unify a broad range of controllable generation tasks while preserving modularity, composability, and practical extensibility across rapidly evolving diffusion backbones. All resources will be open sourced, including code, models, and datasets.
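The three-component design described in the abstract can be illustrated with a toy sketch. This is not the authors' API: every class, function, and field name below (`TemplateCache`, `BrightnessTemplate`, `StyleLoRATemplate`, `TemplatePipeline`, `toy_base_model`) is hypothetical, and the "base model" is a stand-in function that only reports which capability carriers it received. The sketch assumes that a Template cache can be modeled as a carrier type plus a carrier-specific payload, and that merging amounts to grouping caches by carrier type.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class TemplateCache:
    """Hypothetical standardized container for an injectable capability."""
    kind: str                # carrier type, e.g. "kv_cache" or "lora" (assumed labels)
    payload: Dict[str, Any]  # carrier-specific data, keyed by an assumed layer name


class BrightnessTemplate:
    """Toy Template model: maps a brightness value to a fake KV-cache payload."""

    def __call__(self, task_input: float) -> TemplateCache:
        return TemplateCache(kind="kv_cache", payload={"block0": [task_input]})


class StyleLoRATemplate:
    """Toy Template model: maps a style name to a fake LoRA payload."""

    def __call__(self, task_input: str) -> TemplateCache:
        return TemplateCache(kind="lora", payload={"block0": {"style": task_input}})


MergedCaches = Dict[str, List[TemplateCache]]


class TemplatePipeline:
    """Toy Template pipeline: merges caches and passes them to a base model."""

    def __init__(self, base_model: Callable[[str, MergedCaches], str]) -> None:
        self.base_model = base_model

    @staticmethod
    def merge(caches: List[TemplateCache]) -> MergedCaches:
        # Assumed merge rule: group caches by carrier type, preserving order.
        merged: MergedCaches = {}
        for cache in caches:
            merged.setdefault(cache.kind, []).append(cache)
        return merged

    def __call__(self, prompt: str, caches: List[TemplateCache]) -> str:
        return self.base_model(prompt, self.merge(caches))


def toy_base_model(prompt: str, injected: MergedCaches) -> str:
    """Stand-in for a diffusion backbone; it only describes what was injected."""
    carriers = ", ".join(
        f"{kind} x{len(items)}" for kind, items in sorted(injected.items())
    )
    return f"image({prompt!r}; injected: {carriers or 'none'})"


pipeline = TemplatePipeline(toy_base_model)
caches = [BrightnessTemplate()(0.8), StyleLoRATemplate()("watercolor")]
result = pipeline("a cat", caches)
print(result)  # image('a cat'; injected: kv_cache x1, lora x1)
```

The point of the sketch is the separation of concerns: each Template model knows only its own task input, the pipeline knows only the cache interface, and the base model never sees task-specific inputs directly. How the actual framework represents, merges, and injects caches is not specified in the abstract.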