Diffusion Templates: A Unified Plugin Framework for Controllable Diffusion
April 27, 2026
Authors: Zhongjie Duan, Hong Zhang, Yingda Chen
cs.AI
Abstract
Controllable diffusion methods have substantially expanded the practical utility of diffusion models, but they are typically developed as isolated, backbone-specific systems with incompatible training pipelines, parameter formats, and runtime hooks. This fragmentation makes it difficult to reuse infrastructure across tasks, transfer capabilities across backbones, or compose multiple controls within a single generation pipeline. We present Diffusion Templates, a unified and open plugin framework that decouples base-model inference from controllable capability injection. The framework is organized around three components: Template models that map arbitrary task-specific inputs to an intermediate capability representation, a Template cache that functions as a standardized interface for capability injection, and a Template pipeline that loads, merges, and injects one or more Template caches into the base diffusion runtime. Because the interface is defined at the systems level rather than tied to a specific control architecture, heterogeneous capability carriers such as KV-Cache and LoRA can be supported under the same abstraction. Based on this design, we build a diverse model zoo spanning structural control, brightness adjustment, color adjustment, image editing, super-resolution, sharpness enhancement, aesthetic alignment, content reference, local inpainting, and age control. These case studies show that Diffusion Templates can unify a broad range of controllable generation tasks while preserving modularity, composability, and practical extensibility across rapidly evolving diffusion backbones. All resources will be open sourced, including code, models, and datasets.
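The three components described in the abstract can be illustrated with a minimal, framework-free sketch. It is not the authors' implementation: every name here (`TemplateCache`, `TemplatePipeline`, `merge_caches`, the toy template functions, and the fake base runtime) is hypothetical, and plain Python lists stand in for tensors. The merge rules (concatenating KV entries, summing scaled LoRA deltas) are likewise assumptions chosen only to show how heterogeneous carriers could share one interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# NOTE: all names below are hypothetical and are NOT the framework's real API.

@dataclass
class TemplateCache:
    """Standardized container passed from a Template model to the pipeline."""
    carrier: str                     # e.g. "kv_cache" or "lora"
    payload: Dict[str, List[float]]  # layer name -> values (toy stand-in for tensors)
    scale: float = 1.0               # toy strength factor applied at merge time


def brightness_template(level: float) -> TemplateCache:
    """Toy Template model: encodes a brightness level as extra KV entries."""
    return TemplateCache("kv_cache", {"block0.attn": [level, level]})


def style_template(strength: float) -> TemplateCache:
    """Toy Template model: encodes a style strength as a LoRA-like weight delta."""
    return TemplateCache("lora", {"block0.proj": [0.1, -0.2]}, scale=strength)


def merge_caches(caches: List[TemplateCache]) -> Dict[str, Dict[str, List[float]]]:
    """Group caches by carrier; concatenate KV entries, sum LoRA deltas (assumed rules)."""
    merged: Dict[str, Dict[str, List[float]]] = {}
    for cache in caches:
        slot = merged.setdefault(cache.carrier, {})
        for layer, values in cache.payload.items():
            scaled = [v * cache.scale for v in values]
            if layer not in slot:
                slot[layer] = scaled
            elif cache.carrier == "kv_cache":
                slot[layer] = slot[layer] + scaled  # append extra entries
            else:
                slot[layer] = [a + b for a, b in zip(slot[layer], scaled)]  # add deltas
    return merged


class TemplatePipeline:
    """Loads Template caches, merges them, and hands them to the base runtime."""

    def __init__(self, base_runtime: Callable[[str, dict], dict]):
        self.base_runtime = base_runtime
        self.caches: List[TemplateCache] = []

    def load(self, cache: TemplateCache) -> None:
        self.caches.append(cache)

    def __call__(self, prompt: str) -> dict:
        # The base runtime never sees task-specific inputs, only merged caches.
        return self.base_runtime(prompt, merge_caches(self.caches))


def fake_base_runtime(prompt: str, injected: dict) -> dict:
    """Stand-in for a diffusion backbone; just reports what was injected."""
    return {"prompt": prompt, "carriers": sorted(injected)}


pipe = TemplatePipeline(fake_base_runtime)
pipe.load(brightness_template(0.8))
pipe.load(style_template(0.5))
result = pipe("a cat on a sofa")
print(result)
```

In this sketch the base runtime depends only on the cache interface, so a new control task would be added by writing another function that returns a `TemplateCache`, without touching the pipeline or the backbone stand-in.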