Diffusion Templates: A Unified Plugin Framework for Controllable Diffusion

April 27, 2026
Authors: Zhongjie Duan, Hong Zhang, Yingda Chen
cs.AI

Abstract

Controllable diffusion methods have substantially expanded the practical utility of diffusion models, but they are typically developed as isolated, backbone-specific systems with incompatible training pipelines, parameter formats, and runtime hooks. This fragmentation makes it difficult to reuse infrastructure across tasks, transfer capabilities across backbones, or compose multiple controls within a single generation pipeline. We present Diffusion Templates, a unified and open plugin framework that decouples base-model inference from controllable capability injection. The framework is organized around three components: Template models that map arbitrary task-specific inputs to an intermediate capability representation, a Template cache that functions as a standardized interface for capability injection, and a Template pipeline that loads, merges, and injects one or more Template caches into the base diffusion runtime. Because the interface is defined at the systems level rather than tied to a specific control architecture, heterogeneous capability carriers such as KV-Cache and LoRA can be supported under the same abstraction. Based on this design, we build a diverse model zoo spanning structural control, brightness adjustment, color adjustment, image editing, super-resolution, sharpness enhancement, aesthetic alignment, content reference, local inpainting, and age control. These case studies show that Diffusion Templates can unify a broad range of controllable generation tasks while preserving modularity, composability, and practical extensibility across rapidly evolving diffusion backbones. All resources will be open sourced, including code, models, and datasets.
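
The three-component design described in the abstract (Template models, a Template cache, and a Template pipeline) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the class and function names, the cache fields, the toy merge rule (concatenating KV entries and summing LoRA deltas), and the stand-in base model are hypothetical and are not taken from the released framework.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


# NOTE: all names below are hypothetical; this is not the framework's actual API.
@dataclass
class TemplateCache:
    """Standardized container for capabilities to inject into a base model."""
    # layer name -> extra key/value entries (toy stand-in for a KV-Cache carrier)
    kv_cache: Dict[str, List[Any]] = field(default_factory=dict)
    # layer name -> scalar weight delta (toy stand-in for a LoRA carrier)
    lora: Dict[str, float] = field(default_factory=dict)


def brightness_template(level: float) -> TemplateCache:
    """Toy Template model: maps a brightness level to a KV-style cache."""
    return TemplateCache(kv_cache={"attn.0": [("brightness", level)]})


def style_template(strength: float) -> TemplateCache:
    """Toy Template model: maps a style strength to a LoRA-style cache."""
    return TemplateCache(lora={"attn.0": strength, "attn.1": strength / 2})


def merge_caches(caches: List[TemplateCache]) -> TemplateCache:
    """Assumed merge rule: concatenate KV entries, sum LoRA deltas per layer."""
    merged = TemplateCache()
    for cache in caches:
        for layer, entries in cache.kv_cache.items():
            merged.kv_cache.setdefault(layer, []).extend(entries)
        for layer, delta in cache.lora.items():
            merged.lora[layer] = merged.lora.get(layer, 0.0) + delta
    return merged


class TemplatePipeline:
    """Merges one or more caches and hands the result to the base model."""

    def __init__(self, base_model: Callable[[str, TemplateCache], str]):
        self.base_model = base_model

    def __call__(self, prompt: str, caches: List[TemplateCache]) -> str:
        return self.base_model(prompt, merge_caches(caches))


def toy_base_model(prompt: str, cache: TemplateCache) -> str:
    """Stand-in for a diffusion backbone: reports which layers were touched."""
    return f"{prompt} | kv={sorted(cache.kv_cache)} | lora={sorted(cache.lora)}"


if __name__ == "__main__":
    pipeline = TemplatePipeline(toy_base_model)
    print(pipeline("a cat", [brightness_template(0.8), style_template(1.0)]))
```

The point of the sketch is only the separation of concerns: each Template model knows about its own task input, the pipeline knows only about the cache format, and the base model never sees task-specific inputs directly.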